"supervised fine-tuning" Papers

14 papers found

Filters:supervised fine-tuning Clear all

Conference

AAAI 2025 (3,028)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NeurIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,140)oral (1,594)spotlight (1,421)highlight (975)

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

Wang Yang, Zirui Liu, Hongye Jin et al.

NeurIPS 2025posterarXiv:2505.17315

citations

The Promise of RL for Autoregressive Image Editing

Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.

NeurIPS 2025posterarXiv:2508.01119

citations

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.

NeurIPS 2025posterarXiv:2506.05744

citations

Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning

Jiaru Zou, Yikun Ban, Zihao Li et al.

NeurIPS 2025spotlightarXiv:2505.16270

citations

WebDancer: Towards Autonomous Information Seeking Agency

Jialong Wu, Baixuan Li, Runnan Fang et al.

NeurIPS 2025posterarXiv:2505.22648

citations

Words or Vision: Do Vision-Language Models Have Blind Faith in Text?

Ailin Deng, Tri Cao, Zhirui Chen et al.

CVPR 2025posterarXiv:2503.02199

citations

Can AI Assistants Know What They Don't Know?

Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.

ICML 2024poster

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Haoran Xu, Amr Sharaf, Yunmo Chen et al.

ICML 2024poster

Exploring the LLM Journey from Cognition to Expression with Linear Representations

Yuzi Yan, Jialian Li, YipinZhang et al.

ICML 2024poster

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Le Yu, Bowen Yu, Haiyang Yu et al.

ICML 2024poster

Preference Ranking Optimization for Human Alignment

Feifan Song, Bowen Yu, Minghao Li et al.

AAAI 2024paperarXiv:2306.17492

334

citations

Privacy-Preserving Instructions for Aligning Large Language Models

Da Yu, Peter Kairouz, Sewoong Oh et al.

ICML 2024poster

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

Rui Yang, Xiaoman Pan, Feng Luo et al.

ICML 2024poster

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.

ICML 2024poster