ICML Poster "model alignment" Papers
3 papers found
Active Preference Learning for Large Language Models
William Muldrew, Peter Hayes, Mingtian Zhang et al.
ICML 2024poster
Recovering the Pre-Fine-Tuning Weights of Generative Models
Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen
ICML 2024poster
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns, Pavel Izmailov, Jan Kirchner et al.
ICML 2024poster