2024 "human feedback" Papers
3 papers found
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar et al.
ICML 2024 · poster · arXiv:2405.12421
Efficient Exploration for LLMs
Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.
ICML 2024 · poster · arXiv:2402.00396
Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff et al.
ICML 2024 · spotlight · arXiv:2402.01306