2025 "instruction following" Papers
10 papers found
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.
ICLR 2025 | poster | arXiv:2410.18252 | 39 citations
CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model
Dapeng Zhang, Fei Shen, Rui Zhao et al.
NeurIPS 2025 | oral | arXiv:2511.19914
Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance
Aladin Djuhera, Swanand Kadhe, Syed Zawad et al.
NeurIPS 2025 | spotlight | arXiv:2506.06522
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
Yulei Qin, Gang Li, Zongyi Li et al.
NeurIPS 2025 | poster | arXiv:2506.01413 | 4 citations
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
NeurIPS 2025 | poster | arXiv:2502.13329 | 5 citations
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou, Feng Hong, Jiaan Luo et al.
NeurIPS 2025 | poster | arXiv:2503.22215 | 3 citations
Lookahead Routing for Large Language Models
Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.
NeurIPS 2025 | poster | arXiv:2510.19506
SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning
Ziqi Wang, Chang Che, Qi Wang et al.
ICCV 2025 | poster | arXiv:2411.13949 | 3 citations
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.
NeurIPS 2025 | poster | arXiv:2506.04721 | 3 citations
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu, Yunqiao Yang, Houxing Ren et al.
NeurIPS 2025 | oral | arXiv:2505.03733 | 16 citations