ICML 2024 "human feedback alignment" Papers
3 papers found
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining
Aaron Li, Robin Netzorg, Zhihan Cheng et al.
ICML 2024 poster
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu, Michael Jordan, Jiantao Jiao
ICML 2024 poster
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Ganqu Cui, Lifan Yuan, Ning Ding et al.
ICML 2024 poster