"offline policy optimization" Papers
3 papers found
Offline Actor-Critic for Average Reward MDPs
William Powell, Jeongyeol Kwon, Qiaomin Xie et al.
NeurIPS 2025 poster
73 citations
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao et al.
AAAI 2024 paper arXiv:2309.05915
25 citations
Policy-conditioned Environment Models are More Generalizable
Ruifeng Chen, Xiong-Hui Chen, Yihao Sun et al.
ICML 2024 poster