2025 Poster "off-policy reinforcement learning" Papers
6 papers found
Actor-Free Continuous Control via Structurally Maximizable Q-Functions
Yigit Korkmaz, Urvi Bhuwania, Ayush Jain et al.
NeurIPS 2025, poster, arXiv:2510.18828
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.
ICLR 2025, poster, arXiv:2410.18252
39 citations
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
Likun Wang, Xiangteng Zhang, Yinuo Wang et al.
NeurIPS 2025, poster, arXiv:2510.25529
Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
Daniel Palenicek, Florian Vogt, Joe Watson et al.
NeurIPS 2025, poster, arXiv:2502.07523
8 citations
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
Georgios Papoudakis, Thomas Coste, Jianye Hao et al.
NeurIPS 2025, poster, arXiv:2509.01720
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat et al.
ICLR 2025, poster, arXiv:2410.11711