"off-policy reinforcement learning" Papers
7 papers found
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
Likun Wang, Xiangteng Zhang, Yinuo Wang et al.
NeurIPS 2025posterarXiv:2510.25529
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
Georgios Papoudakis, Thomas Coste, Jianye Hao et al.
NeurIPS 2025posterarXiv:2509.01720
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka, Alejandro Escontrela, Pieter Abbeel et al.
ICML 2024poster
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu Luo, Tianying Ji, Fuchun Sun et al.
ICML 2024poster
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.
ICML 2024poster
RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning
Yukinari Hisaki, Isao Ono
ICML 2024poster
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji, Yu Luo, Fuchun Sun et al.
ICML 2024poster