Poster "preference learning" Papers

23 papers found

Advancing LLM Reasoning Generalists with Preference Trees

Lifan Yuan, Ganqu Cui, Hanbin Wang et al.

ICLR 2025 · poster · arXiv:2404.02078
179 citations

Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Hanyang Wang, Juergen Branke, Matthias Poloczek

NeurIPS 2025 · poster · arXiv:2501.18792

Diverse Preference Learning for Capabilities and Alignment

Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell

ICLR 2025 · poster · arXiv:2511.08594
21 citations

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025 · poster
22 citations

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NeurIPS 2025 · poster · arXiv:2507.00018
14 citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 · poster · arXiv:2410.17637
19 citations

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025 · poster · arXiv:2410.03450
8 citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 · poster · arXiv:2507.21391
6 citations

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NeurIPS 2025 · poster · arXiv:2505.13787
4 citations

Scalable Valuation of Human Feedback through Provably Robust Model Alignment

Masahiro Fujisawa, Masaki Adachi, Michael A Osborne

NeurIPS 2025 · poster · arXiv:2505.17859
1 citation

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

ICLR 2025 · poster · arXiv:2411.00418
18 citations

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NeurIPS 2025 · poster · arXiv:2506.01420
3 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NeurIPS 2025 · poster · arXiv:2506.04721
3 citations

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.

ICLR 2025 · poster · arXiv:2410.08847
47 citations

Variational Best-of-N Alignment

Afra Amini, Tim Vieira, Elliott Ash et al.

ICLR 2025 · poster · arXiv:2407.06057
37 citations

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 · poster

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024 · poster

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Kenneth Li, Samy Jelassi, Hugh Zhang et al.

ICML 2024 · poster

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Harrison Lee, Samrat Phatale, Hassan Mansoor et al.

ICML 2024 · poster

RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

Yufei Wang, Zhanyi Sun, Jesse Zhang et al.

ICML 2024 · poster

Self-Rewarding Language Models

Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.

ICML 2024 · poster

Transforming and Combining Rewards for Aligning Large Language Models

Zihao Wang, Chirag Nagpal, Jonathan Berant et al.

ICML 2024 · poster

ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding et al.

ICML 2024 · poster