2025 "instruction following" Papers

10 papers found

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.

ICLR 2025, poster, arXiv:2410.18252
39 citations

CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model

Dapeng Zhang, Fei Shen, Rui Zhao et al.

NeurIPS 2025, oral, arXiv:2511.19914

Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance

Aladin Djuhera, Swanand Kadhe, Syed Zawad et al.

NeurIPS 2025, spotlight, arXiv:2506.06522

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Yulei Qin, Gang Li, Zongyi Li et al.

NeurIPS 2025, poster, arXiv:2506.01413
4 citations

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NeurIPS 2025, poster, arXiv:2502.13329
5 citations

Learning to Instruct for Visual Instruction Tuning

Zhihan Zhou, Feng Hong, Jiaan Luo et al.

NeurIPS 2025, poster, arXiv:2503.22215
3 citations

Lookahead Routing for Large Language Models

Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.

NeurIPS 2025, poster, arXiv:2510.19506

SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning

Ziqi Wang, Chang Che, Qi Wang et al.

ICCV 2025, poster, arXiv:2411.13949
3 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NeurIPS 2025, poster, arXiv:2506.04721
3 citations

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Zimu Lu, Yunqiao Yang, Houxing Ren et al.

NeurIPS 2025, oral, arXiv:2505.03733
16 citations