2025 Poster "instruction following" Papers

18 papers found

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.

ICLR 2025 poster · arXiv:2410.18252 · 39 citations

Generalizing Verifiable Instruction Following

Valentina Pyatkin, Saumya Malik, Victoria Graf et al.

NeurIPS 2025 poster · arXiv:2507.02833 · 35 citations

HalLoc: Token-level Localization of Hallucinations for Vision Language Models

Eunkyu Park, Minyeong Kim, Gunhee Kim

CVPR 2025 poster · arXiv:2506.10286 · 3 citations

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Yulei Qin, Gang Li, Zongyi Li et al.

NeurIPS 2025 poster · arXiv:2506.01413 · 4 citations

Is In-Context Learning Sufficient for Instruction Following in LLMs?

Hao Zhao, Maksym Andriushchenko, Francesco Croce et al.

ICLR 2025 poster · arXiv:2405.19874 · 21 citations

Language-Image Models with 3D Understanding

Jang Hyun Cho, Boris Ivanovic, Yulong Cao et al.

ICLR 2025 poster · arXiv:2405.03685 · 27 citations

Language Imbalance Driven Rewarding for Multilingual Self-improving

Wen Yang, Junhong Wu, Chen Wang et al.

ICLR 2025 poster · arXiv:2410.08964 · 23 citations

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NeurIPS 2025 poster · arXiv:2502.13329 · 5 citations

Learning to Instruct for Visual Instruction Tuning

Zhihan Zhou, Feng Hong, Jiaan Luo et al.

NeurIPS 2025 poster · arXiv:2503.22215 · 3 citations

Lookahead Routing for Large Language Models

Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.

NeurIPS 2025 poster · arXiv:2510.19506

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Zhangchen Xu, Fengqing Jiang, Luyao Niu et al.

ICLR 2025 poster · arXiv:2406.08464 · 261 citations

Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost

Sheng Cao, Mingrui Wu, Karthik Prasad et al.

ICLR 2025 poster

SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning

Ziqi Wang, Chang Che, Qi Wang et al.

ICCV 2025 poster · arXiv:2411.13949 · 3 citations

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025 poster · arXiv:2412.11605 · 12 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NeurIPS 2025 poster · arXiv:2506.04721 · 3 citations

Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Benjamin Feuer, Micah Goldblum, Teresa Datta et al.

ICLR 2025 poster · arXiv:2409.15268 · 27 citations

Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers

Daniel Dsouza, Julia Kreutzer, Adrien Morisot et al.

NeurIPS 2025 poster · arXiv:2506.14702

Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning

Minheng Ni, YuTao Fan, Lei Zhang et al.

ICLR 2025 poster · arXiv:2410.03321 · 20 citations