Spotlight "reinforcement learning" Papers
17 papers found
ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition
Daolang Huang, Xinyi Wen, Ayush Bharti et al.
AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws
Oren Neumann, Claudius Gros
Checklists Are Better Than Reward Models For Aligning Language Models
Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.
CURE: Co-Evolving Coders and Unit Testers via Reinforcement Learning
Yinjie Wang, Ling Yang, Ye Tian et al.
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao, Devaansh Gupta, Qinqing Zheng et al.
EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
Qianyue Hao, Yiwen Song, Qingmin Liao et al.
Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning
Gunshi Gupta, Karmesh Yadav, Zsolt Kira et al.
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.
Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach
Chenbei Lu, Zaiwei Chen, Tongxin Li et al.
Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto, Mohammad Sami Nur Islam, Martin Klissarov et al.
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Yinjun Wu, Mayank Keoliya, Kan Chen et al.
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang, Shaohuai Liu, Weirui Ye et al.
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wołczyk, Bartłomiej Cupiał, Mateusz Ostaszewski et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando Ceron, Ghada Sokar, Timon Willi et al.
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
Zelei Cheng, Xian Wu, Jiahao Yu et al.
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.