2025 Poster "instruction following" Papers
18 papers found
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.
Generalizing Verifiable Instruction Following
Valentina Pyatkin, Saumya Malik, Victoria Graf et al.
HalLoc: Token-level Localization of Hallucinations for Vision Language Models
Eunkyu Park, Minyeong Kim, Gunhee Kim
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
Yulei Qin, Gang Li, Zongyi Li et al.
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Hao Zhao, Maksym Andriushchenko, francesco croce et al.
Language-Image Models with 3D Understanding
Jang Hyun Cho, Boris Ivanovic, Yulong Cao et al.
Language Imbalance Driven Rewarding for Multilingual Self-improving
Wen Yang, Junhong Wu, Chen Wang et al.
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou, Feng Hong, JIAAN LUO et al.
Lookahead Routing for Large Language Models
Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu, Fengqing Jiang, Luyao Niu et al.
Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost
Sheng Cao, Mingrui Wu, Karthik Prasad et al.
SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning
Ziqi Wang, Chang Che, Qi Wang et al.
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng, Xiao Liu, Cunxiang Wang et al.
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Benjamin Feuer, Micah Goldblum, Teresa Datta et al.
Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
Daniel Dsouza, Julia Kreutzer, Adrien Morisot et al.
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
Minheng Ni, YuTao Fan, Lei Zhang et al.