2025 "offline reinforcement learning" Papers
22 papers found
$q$-exponential family for policy optimization
Lingwei Zhu, Haseeb Shah, Han Wang et al.
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
Zeyuan Liu, Zhihe Yang, Jiawei Xu et al.
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang, Min-hwan Oh
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
Chao Li, Ziwei Deng, Chenxing Lin et al.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Shiyuan Zhang, Weitong Zhang, Quanquan Gu
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
Yiqin Yang, Quanwei Wang, Chenghao Li et al.
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Jongmin Lee, Ernest Ryu
Forecasting in Offline Reinforcement Learning for Non-stationary Environments
Suzan Ece Ada, Georg Martius, Emre Ugur et al.
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao, Yucheng Xin, Silang Wu et al.
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Yuchen Xia, Yunjian Xu
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park, Kevin Frans, Benjamin Eysenbach et al.
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng, Ruixi Qiao, Yingwei Ma et al.
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
Tian-Shuo Liu, Xu-Hui Liu, Ruifeng Chen et al.
Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
Xingyu Jiang, Ning Gao, Xiuhui Zhang et al.
Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
Hongling Zheng, Li Shen, Yong Luo et al.