"preference optimization" Papers

60 papers found • Page 2 of 2

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

Wenhong Zhu, Zhiwei He, Xiaofeng Wang et al.

ICLR 2025arXiv:2410.18640
15
citations

Weighted-Reward Preference Optimization for Implicit Model Fusion

Ziyi Yang, Fanqi Wan, Longguang Zhong et al.

ICLR 2025arXiv:2412.03187
14
citations

Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs

Xudong Li, Mengdan Zhang, Peixian Chen et al.

NEURIPS 2025arXiv:2505.22396
2
citations

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Gokul Swamy, Christoph Dann, Rahul Kidambi et al.

ICML 2024arXiv:2401.04056
139
citations

Can AI Assistants Know What They Don't Know?

Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.

ICML 2024arXiv:2401.13275
43
citations

Generalized Preference Optimization: A Unified Approach to Offline Alignment

Yunhao Tang, Zhaohan Guo, Zeyu Zheng et al.

ICML 2024arXiv:2402.05749
150
citations

Human Alignment of Large Language Models through Online Preference Optimisation

Daniele Calandriello, Zhaohan Guo, REMI MUNOS et al.

ICML 2024arXiv:2403.08635
88
citations

Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models

Songtao Liu, Hanjun Dai, Yue Zhao et al.

ICML 2024arXiv:2406.02066
4
citations

RLVF: Learning from Verbal Feedback without Overgeneralization

Moritz Stephan, Alexander Khazatsky, Eric Mitchell et al.

ICML 2024arXiv:2402.10893
14
citations

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.

ICML 2024arXiv:2401.01335
480
citations