2025 Poster "preference learning" Papers
13 papers found
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan, Ganqu Cui, Hanbin Wang et al.
ICLR 2025posterarXiv:2404.02078
179
citations
Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble
Hanyang Wang, Juergen Branke, Matthias Poloczek
NeurIPS 2025posterarXiv:2501.18792
Diverse Preference Learning for Capabilities and Alignment
Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell
ICLR 2025posterarXiv:2511.08594
21
citations
DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
Huaisheng Zhu, Teng Xiao, Vasant Honavar
ICLR 2025poster
22
citations
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Bo Wang, Qinyuan Cheng, Runyu Peng et al.
NeurIPS 2025posterarXiv:2507.00018
14
citations
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
ICLR 2025posterarXiv:2410.17637
19
citations
Preference Learning with Lie Detectors can Induce Honesty or Evasion
Chris Cundy, Adam Gleave
NeurIPS 2025posterarXiv:2505.13787
4
citations
Scalable Valuation of Human Feedback through Provably Robust Model Alignment
Masahiro Fujisawa, Masaki Adachi, Michael A Osborne
NeurIPS 2025posterarXiv:2505.17859
1
citations
SELF-EVOLVED REWARD LEARNING FOR LLMS
Chenghua Huang, Zhizhen Fan, Lu Wang et al.
ICLR 2025posterarXiv:2411.00418
18
citations
Self-Refining Language Model Anonymizers via Adversarial Distillation
Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin
NeurIPS 2025posterarXiv:2506.01420
3
citations
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.
NeurIPS 2025posterarXiv:2506.04721
3
citations
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.
ICLR 2025posterarXiv:2410.08847
47
citations
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
ICLR 2025posterarXiv:2407.06057
37
citations