2025 Poster Papers on "reward modeling"
19 papers found
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan, Ganqu Cui, Hanbin Wang et al.
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng, Caroline Chan, Fredo Durand et al.
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Chengzhuo Tong, Ziyu Guo, Renrui Zhang et al.
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
Yisong Xiao, Aishan Liu, Siyuan Liang et al.
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau et al.
Measuring memorization in RLHF for code completion
Jamie Hayes, Ilia Shumailov, Billy Porter et al.
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
Online-to-Offline RL for Agent Alignment
Xu Liu, Haobo Fu, Stefano V. Albrecht et al.
PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
Daiwei Chen, Yi Chen, Aniket Rege et al.
Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward
Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi et al.
Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
Simon Matrenok, Skander Moalla, Caglar Gulcehre
Rethinking Reward Modeling in Preference-based Large Language Model Alignment
Hao Sun, Yunyi Shen, Jean-Francois Ton
Reward Learning from Multiple Feedback Types
Yannick Metz, Andras Geiszl, Raphaël Baur et al.
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.
Selftok-Zero: Reinforcement Learning for Visual Generation via Discrete and Autoregressive Visual Tokens
Bohan Wang, Mingze Zhou, Zhongqi Yue et al.
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
Zexu Sun, Yiju Guo, Yankai Lin et al.
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation
Kuo-Han Hung, Pang-Chi Lo, Jia-Fong Yeh et al.