2024 Poster "reward modeling" Papers
5 papers found
Efficient Exploration for LLMs
Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao et al.
ICML 2024 poster
HarmonyDream: Task Harmonization Inside World Models
Haoyu Ma, Jialong Wu, Ningya Feng et al.
ICML 2024 poster
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
JoonHo Lee, Jae Oh Woo, Juree Seok et al.
ICML 2024 poster
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Zhixiong Zhuang, Irina Nicolae, Mario Fritz
ICML 2024 poster
Token-level Direct Preference Optimization
Yongcheng Zeng, Guoqing Liu, Weiyu Ma et al.
ICML 2024 poster