Poster "offline reinforcement learning" Papers
54 papers found
$q$-exponential family for policy optimization
Lingwei Zhu, Haseeb Shah, Han Wang et al.
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
Zeyuan Liu, Zhihe Yang, Jiawei Xu et al.
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang, Min-hwan Oh
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
Chao Li, Ziwei Deng, Chenxing Lin et al.
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Shiyuan Zhang, Weitong Zhang, Quanquan Gu
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
Yiqin Yang, Quanwei Wang, Chenghao Li et al.
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Jongmin Lee, Ernest Ryu
FOSP: Fine-tuning Offline Safe Policy through World Models
Chenyang Cao, Yucheng Xin, Silang Wu et al.
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Yuchen Xia, Yunjian Xu
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park, Kevin Frans, Benjamin Eysenbach et al.
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.
Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
Xingyu Jiang, Ning Gao, Xiuhui Zhang et al.
Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
Hongling Zheng, Li Shen, Yong Luo et al.
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu, Yang Li, Yixing Lan et al.
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Linear MDPs
Kihyuk Hong, Ambuj Tewari
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
Qianlan Yang, Yu-Xiong Wang
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Hao Hu, Yiqin Yang, Jianing Ye et al.
Causal Action Influence Aware Counterfactual Data Augmentation
Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica et al.
Confidence Aware Inverse Constrained Reinforcement Learning
Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi et al.
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Xiaoyu Wen, Chenjia Bai, Kang Xu et al.
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Xinyu Zhang, Wenjie Qiu, Yi-Chen Li et al.
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
Guanghe Li, Yixiang Shan, Zhengbang Zhu et al.
Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning
Takayuki Osa, Tatsuya Harada
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
Shuze Liu, Shangtong Zhang
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
Fangwei Zhong, Kui Wu, Hai Ci et al.
Enhancing Value Function Estimation through First-Order State-Action Dynamics in Offline Reinforcement Learning
Yun-Hsuan Lien, Ping-Chun Hsieh, Tzu-Mao Li et al.
Exploration and Anti-Exploration with Distributional Random Network Distillation
Kai Yang, Jian Tao, Jiafei Lyu et al.
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo, Laixi Shi, Gauri Joshi et al.
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Shengchao Hu, Ziqing Fan, Li Shen et al.
Improving Generalization in Offline Reinforcement Learning via Adversarial Data Splitting
Da Wang, Lin Li, Wei Wei et al.
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Sili Huang, Jifeng Hu, Hechang Chen et al.
Inferring the Long-Term Causal Effects of Long-Term Treatments from Short-Term Experiments
Allen Tran, Aurelien Bibaut, Nathan Kallus
Information-Directed Pessimism for Offline Reinforcement Learning
Alec Koppel, Sujay Bhatt, Jiacheng Guo et al.
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
Lei Zhao, Mengdi Wang, Yu Bai
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka, Alejandro Escontrela, Pieter Abbeel et al.
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi, Sangwon Jung, Hongjoon Ahn et al.
Model-based Reinforcement Learning for Confounded POMDPs
Mao Hong, Zhengling Qi, Yanxun Xu
Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
Kishan Panaganti, Adam Wierman, Eric Mazumdar
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.
Offline Transition Modeling via Contrastive Energy Learning
Ruifeng Chen, Chengxing Jia, Zefang Huang et al.
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
Chang Chen, Junyeob Baek, Fei Deng et al.
Q-value Regularized Transformer for Offline Reinforcement Learning
Shengchao Hu, Ziqing Fan, Chaoqin Huang et al.
ReDiffuser: Reliable Decision-Making Using a Diffuser with Confidence Estimation
Nantian He, Shaohui Li, Zhi Li et al.
Reinformer: Max-Return Sequence Modeling for Offline RL
Zifeng Zhuang, Dengyun Peng, Jinxin Liu et al.