NeurIPS "human preference alignment" Papers
7 papers found
A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
Joshua Tian Jin Tee, Hee Suk Yoon, Abu Hanif Muhammad Syarubany et al.
NeurIPS 2025 oral
Aligning Text-to-Image Diffusion Models to Human Preference by Classification
Longquan Dai, Xiaolu Wei, Wang He et al.
NeurIPS 2025 spotlight
Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations
Peng Lai, Jianjie Zheng, Sijie Cheng et al.
NeurIPS 2025 poster arXiv:2508.03550
2 citations
Direct Alignment with Heterogeneous Preferences
Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.
NeurIPS 2025 poster arXiv:2502.16320
8 citations
Inference-Time Reward Hacking in Large Language Models
Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.
NeurIPS 2025 spotlight arXiv:2506.19248
2 citations
Risk-aware Direct Preference Optimization under Nested Risk Measure
Lijun Zhang, Lin Li, Yajie Qi et al.
NeurIPS 2025 poster arXiv:2505.20359
1 citation
WorldModelBench: Judging Video Generation Models As World Models
Dacheng Li, Yunhao Fang, Yukang Chen et al.
NeurIPS 2025 poster arXiv:2502.20694
31 citations