Spotlight "reinforcement learning" Papers

17 papers found

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NeurIPS 2025spotlightarXiv:2506.07259
2
citations

AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws

Oren Neumann, Claudius Gros

NeurIPS 2025spotlightarXiv:2412.11979
9
citations

Checklists Are Better Than Reward Models For Aligning Language Models

Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.

NeurIPS 2025spotlightarXiv:2507.18624
23
citations

CURE: Co-Evolving Coders and Unit Testers via Reinforcement Learning

Yinjie Wang, Ling Yang, Ye Tian et al.

NeurIPS 2025spotlight

d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning

Siyan Zhao, Devaansh Gupta, Qinqing Zheng et al.

NeurIPS 2025spotlightarXiv:2504.12216
75
citations

EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling

Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.

NeurIPS 2025spotlightarXiv:2502.00466
2
citations

LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models

Qianyue Hao, Yiwen Song, Qingmin Liao et al.

NeurIPS 2025spotlightarXiv:2505.15293
3
citations

Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning

Gunshi Gupta, Karmesh Yadav, Zsolt Kira et al.

NeurIPS 2025spotlightarXiv:2510.19732

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.

NeurIPS 2025spotlightarXiv:2505.21908
3
citations

Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach

Chenbei Lu, Zaiwei Chen, Tongxin Li et al.

NeurIPS 2025spotlightarXiv:2510.18687
1
citations

Code as Reward: Empowering Reinforcement Learning with VLMs

David Venuto, Mohammad Sami Nur Islam, Martin Klissarov et al.

ICML 2024spotlight

DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation

Yinjun Wu, Mayank Keoliya, Kan Chen et al.

ICML 2024spotlight

EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data

Shengjie Wang, Shaohuai Liu, Weirui Ye et al.

ICML 2024spotlight

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem

Maciej Wołczyk, Bartłomiej Cupiał, Mateusz Ostaszewski et al.

ICML 2024spotlight

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Johan Obando Ceron, Ghada Sokar, Timon Willi et al.

ICML 2024spotlight

RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation

Zelei Cheng, Xian Wu, Jiahao Yu et al.

ICML 2024spotlight

Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.

ICML 2024spotlight