"preference learning" Papers

50 papers found

Advancing LLM Reasoning Generalists with Preference Trees

Lifan Yuan, Ganqu Cui, Hanbin Wang et al.

ICLR 2025 poster · arXiv:2404.02078
179 citations

Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences

Jing-An Sun, Hang Fan, Junchao Gong et al.

NEURIPS 2025 poster · arXiv:2505.22008
2 citations

Aligning Text-to-Image Diffusion Models to Human Preference by Classification

Longquan Dai, Xiaolu Wei, Wang He et al.

NEURIPS 2025 spotlight

Bandit Learning in Matching Markets with Indifference

Fang Kong, Jingqi Tang, Mingzhu Li et al.

ICLR 2025 poster
1 citation

Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Hanyang Wang, Juergen Branke, Matthias Poloczek

NEURIPS 2025 poster · arXiv:2501.18792

DeepHalo: A Neural Choice Model with Controllable Context Effects

Shuhan Zhang, Zhi Wang, Rui Gao et al.

NEURIPS 2025 oral · arXiv:2601.04616

Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences

Joshua Ashkinaze, Hua Shen, Saipranav Avula et al.

NEURIPS 2025 oral · arXiv:2511.02109

Diverse Preference Learning for Capabilities and Alignment

Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell

ICLR 2025 poster · arXiv:2511.08594
21 citations

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025 poster
22 citations

Generative Adversarial Ranking Nets

Yinghua Yao, Yuangang Pan, Jing Li et al.

ICLR 2025 poster
1 citation

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NEURIPS 2025 poster · arXiv:2507.00018
14 citations

Improving Large Vision and Language Models by Learning from a Panel of Peers

Jefferson Hernandez, Jing Shi, Simon Jenni et al.

ICCV 2025 poster · arXiv:2509.01610
1 citation

Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach

Haitong Ma, Haoran Yu, Haobo Fu et al.

NEURIPS 2025 poster

LLaVA-Critic: Learning to Evaluate Multimodal Models

Tianyi Xiong, Xiyao Wang, Dong Guo et al.

CVPR 2025 poster · arXiv:2410.02712
99 citations

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

Avinandan Bose, Zhihan Xiong, Yuejie Chi et al.

COLM 2025 paper · arXiv:2504.14439
10 citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 poster · arXiv:2410.17637
19 citations

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025 poster · arXiv:2410.03450
8 citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 poster · arXiv:2507.21391
7 citations

NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data

Haolong Qian, Xianliang Yang, Ling Zhang et al.

NEURIPS 2025 poster

Online-to-Offline RL for Agent Alignment

Xu Liu, Haobo Fu, Stefano V. Albrecht et al.

ICLR 2025 poster

Predictive Preference Learning from Human Interventions

Haoyuan Cai, Zhenghao (Mark) Peng, Bolei Zhou

NEURIPS 2025 spotlight · arXiv:2510.01545

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NEURIPS 2025 poster · arXiv:2505.13787
4 citations

Preference Learning with Response Time: Robust Losses and Guarantees

Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis

NEURIPS 2025 oral · arXiv:2505.22820
1 citation

Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi et al.

NEURIPS 2025 poster · arXiv:2601.19055

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning

Timo Kaufmann, Yannick Metz, Daniel Keim et al.

NEURIPS 2025 poster · arXiv:2512.25023

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Tianyu Yu, Haoye Zhang, Qiming Li et al.

CVPR 2025 highlight · arXiv:2405.17220
58 citations

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

ICLR 2025 poster
24 citations

Scalable Valuation of Human Feedback through Provably Robust Model Alignment

Masahiro Fujisawa, Masaki Adachi, Michael A Osborne

NEURIPS 2025 poster · arXiv:2505.17859
1 citation

Self-Boosting Large Language Models with Synthetic Preference Data

Qingxiu Dong, Li Dong, Xingxing Zhang et al.

ICLR 2025 poster · arXiv:2410.06961
30 citations

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

ICLR 2025 poster · arXiv:2411.00418
19 citations

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025 poster · arXiv:2506.01420
3 citations

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025 poster · arXiv:2412.11605
12 citations

Sparta Alignment: Collectively Aligning Multiple Language Models through Combat

Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.

NEURIPS 2025 poster · arXiv:2506.04721
3 citations

Tulu 3: Pushing Frontiers in Open Language Model Post-Training

Nathan Lambert, Jacob Morrison, Valentina Pyatkin et al.

COLM 2025 paper · arXiv:2411.15124
491 citations

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.

ICLR 2025 poster · arXiv:2410.08847
49 citations

Variational Best-of-N Alignment

Afra Amini, Tim Vieira, Elliott Ash et al.

ICLR 2025 poster · arXiv:2407.06057
37 citations

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 poster · arXiv:2503.20491
13 citations

Active Preference Learning for Large Language Models

William Muldrew, Peter Hayes, Mingtian Zhang et al.

ICML 2024 poster · arXiv:2402.08114

Customizing Language Model Responses with Contrastive In-Context Learning

Xiang Gao, Kamalika Das

AAAI 2024 paper · arXiv:2401.17390
19 citations

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024 poster · arXiv:2404.06013

Improved Bandits in Many-to-One Matching Markets with Incentive Compatibility

Fang Kong, Shuai Li

AAAI 2024 paper · arXiv:2401.01528
10 citations

Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning

Joseph Giovanelli, Alexander Tornede, Tanja Tornede et al.

AAAI 2024 paper · arXiv:2309.03581

Model Alignment as Prospect Theoretic Optimization

Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff et al.

ICML 2024 spotlight · arXiv:2402.01306

Multi-Objective Bayesian Optimization with Active Preference Learning

Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.

AAAI 2024 paper · arXiv:2311.13460
14 citations

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Kenneth Li, Samy Jelassi, Hugh Zhang et al.

ICML 2024 poster · arXiv:2402.14688

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Harrison Lee, Samrat Phatale, Hassan Mansoor et al.

ICML 2024 poster · arXiv:2309.00267

RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

Yufei Wang, Zhanyi Sun, Jesse Zhang et al.

ICML 2024 poster · arXiv:2402.03681

Self-Rewarding Language Models

Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.

ICML 2024 poster · arXiv:2401.10020

Transforming and Combining Rewards for Aligning Large Language Models

Zihao Wang, Chirag Nagpal, Jonathan Berant et al.

ICML 2024 poster · arXiv:2402.00742

UltraFeedback: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding et al.

ICML 2024 poster · arXiv:2310.01377