"supervised fine-tuning" Papers
14 papers found
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
Wang Yang, Zirui Liu, Hongye Jin et al.
NeurIPS 2025posterarXiv:2505.17315
3
citations
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.
NeurIPS 2025posterarXiv:2508.01119
2
citations
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.
NeurIPS 2025posterarXiv:2506.05744
13
citations
Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning
Jiaru Zou, Yikun Ban, Zihao Li et al.
NeurIPS 2025spotlightarXiv:2505.16270
10
citations
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
NeurIPS 2025posterarXiv:2505.22648
81
citations
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Ailin Deng, Tri Cao, Zhirui Chen et al.
CVPR 2025posterarXiv:2503.02199
33
citations
Can AI Assistants Know What They Don't Know?
Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.
ICML 2024poster
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Haoran Xu, Amr Sharaf, Yunmo Chen et al.
ICML 2024poster
Exploring the LLM Journey from Cognition to Expression with Linear Representations
Yuzi Yan, Jialian Li, YipinZhang et al.
ICML 2024poster
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu, Bowen Yu, Haiyang Yu et al.
ICML 2024poster
Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu, Minghao Li et al.
AAAI 2024paperarXiv:2306.17492
334
citations
Privacy-Preserving Instructions for Aligning Large Language Models
Da Yu, Peter Kairouz, Sewoong Oh et al.
ICML 2024poster
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang, Xiaoman Pan, Feng Luo et al.
ICML 2024poster
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.
ICML 2024poster