ICLR Poster "direct preference optimization" Papers
4 papers found
DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
Huaisheng Zhu, Teng Xiao, Vasant Honavar
ICLR 2025 poster
22 citations
Measuring memorization in RLHF for code completion
Jamie Hayes, I Shumailov, Billy Porter et al.
ICLR 2025 poster, arXiv:2406.11715
10 citations
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
ICLR 2025 poster, arXiv:2410.17637
19 citations
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar et al.
ICLR 2025 poster, arXiv:2410.08847
47 citations