"preference learning" Papers
50 papers found
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan, Ganqu Cui, Hanbin Wang et al.
Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences
Jing-An Sun, Hang Fan, Junchao Gong et al.
Aligning Text-to-Image Diffusion Models to Human Preference by Classification
Longquan Dai, Xiaolu Wei, Wang He et al.
Bandit Learning in Matching Markets with Indifference
Fang Kong, Jingqi Tang, Mingzhu Li et al.
Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble
Hanyang Wang, Juergen Branke, Matthias Poloczek
DeepHalo: A Neural Choice Model with Controllable Context Effects
Shuhan Zhang, Zhi Wang, Rui Gao et al.
Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences
Joshua Ashkinaze, Hua Shen, Saipranav Avula et al.
Diverse Preference Learning for Capabilities and Alignment
Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell
DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
Huaisheng Zhu, Teng Xiao, Vasant Honavar
Generative Adversarial Ranking Nets
Yinghua Yao, Yuangang Pan, Jing Li et al.
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Bo Wang, Qinyuan Cheng, Runyu Peng et al.
Improving Large Vision and Language Models by Learning from a Panel of Peers
Jefferson Hernandez, Jing Shi, Simon Jenni et al.
Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
Haitong Ma, Haoran Yu, Haobo Fu et al.
LLaVA-Critic: Learning to Evaluate Multimodal Models
Tianyi Xiong, Xiyao Wang, Dong Guo et al.
LoRe: Personalizing LLMs via Low-Rank Reward Modeling
Avinandan Bose, Zhihan Xiong, Yuejie Chi et al.
MIA-DPO: Multi-Image Augmented Direct Preference Optimization for Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data
Haolong Qian, Xianliang Yang, Ling Zhang et al.
Online-to-Offline RL for Agent Alignment
Xu Liu, Haobo Fu, Stefano V. Albrecht et al.
Predictive Preference Learning from Human Interventions
Haoyuan Cai, Zhenghao (Mark) Peng, Bolei Zhou
Preference Learning with Lie Detectors can Induce Honesty or Evasion
Chris Cundy, Adam Gleave
Preference Learning with Response Time: Robust Losses and Guarantees
Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis
Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward
Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi et al.
ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning
Timo Kaufmann, Yannick Metz, Daniel Keim et al.
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Tianyu Yu, Haoye Zhang, Qiming Li et al.
RouteLLM: Learning to Route LLMs from Preference Data
Isaac Ong, Amjad Almahairi, Vincent Wu et al.
Scalable Valuation of Human Feedback through Provably Robust Model Alignment
Masahiro Fujisawa, Masaki Adachi, Michael A Osborne
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong, Li Dong, Xingxing Zhang et al.
Self-Evolved Reward Learning for LLMs
Chenghua Huang, Zhizhen Fan, Lu Wang et al.
Self-Refining Language Model Anonymizers via Adversarial Distillation
Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng, Xiao Liu, Cunxiang Wang et al.
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang, Wenxuan Ding, Shangbin Feng et al.
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Nathan Lambert, Jacob Morrison, Valentina Pyatkin et al.
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.
Active Preference Learning for Large Language Models
William Muldrew, Peter Hayes, Mingtian Zhang et al.
Customizing Language Model Responses with Contrastive In-Context Learning
Xiang Gao, Kamalika Das
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Xuheng Li, Heyang Zhao, Quanquan Gu
Improved Bandits in Many-to-One Matching Markets with Incentive Compatibility
Fang Kong, Shuai Li
Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning
Joseph Giovanelli, Alexander Tornede, Tanja Tornede et al.
Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff et al.
Multi-Objective Bayesian Optimization with Active Preference Learning
Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Kenneth Li, Samy Jelassi, Hugh Zhang et al.
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Harrison Lee, Samrat Phatale, Hassan Mansoor et al.
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang, Zhanyi Sun, Jesse Zhang et al.
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.
Transforming and Combining Rewards for Aligning Large Language Models
Zihao Wang, Chirag Nagpal, Jonathan Berant et al.
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Ganqu Cui, Lifan Yuan, Ning Ding et al.