Poster "policy improvement" Papers
4 papers found
When Can Model-Free Reinforcement Learning be Enough for Thinking?
Josiah Hanna, Nicholas Corrado
NeurIPS 2025 poster, arXiv:2506.17124
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu, Simon Zhan, Yixuan Wang et al.
ICML 2024 poster
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen, Chen Zhu, Jiuhai Chen et al.
ICML 2024 poster
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills
Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi et al.
ICML 2024 poster