NeurIPS Poster "offline reinforcement learning" Papers
16 papers found
Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
Ruiqi Xue, Ziqian Zhang, Lihe Li et al.
NeurIPS 2025 poster
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
Zeyuan Liu, Zhihe Yang, Jiawei Xu et al.
NeurIPS 2025 poster, arXiv:2505.23871
2 citations
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu, Sili Huang, Zhejian Yang et al.
NeurIPS 2025 poster, arXiv:2505.01822
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Jongmin Lee, Ernest Ryu
NeurIPS 2025 poster, arXiv:2510.17391
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.
NeurIPS 2025 poster, arXiv:2502.14819
18 citations
Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
Haitong Ma, Haoran Yu, Haobo Fu et al.
NeurIPS 2025 poster
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Nguyen Phuc, Ngoc-Hieu Nguyen, Duy M. H. Nguyen et al.
NeurIPS 2025 poster, arXiv:2506.08681
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.
NeurIPS 2025 poster, arXiv:2502.08021
4 citations
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Yuchen Xia, Yunjian Xu
NeurIPS 2025 poster
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.
NeurIPS 2025 poster, arXiv:2506.06964
2 citations
Online Optimization for Offline Safe Reinforcement Learning
Yassine Chemingui, Aryan Deshwal, Alan Fern et al.
NeurIPS 2025 poster, arXiv:2510.22027
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
Jongchan Park, Mingyu Park, Donghwan Lee
NeurIPS 2025 poster, arXiv:2505.05701
1 citation
Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning
Wensong Bai, Chufan Chen, Yichao Fu et al.
NeurIPS 2025 poster
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
NeurIPS 2025 poster, arXiv:2505.09496
1 citation
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.
NeurIPS 2025 poster, arXiv:2412.05718
3 citations
Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
Hongling Zheng, Li Shen, Yong Luo et al.
NeurIPS 2025 poster