2025 Poster "value function estimation" Papers
4 papers found
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.
NeurIPS 2025 poster, arXiv:2505.20686
12 citations
Bootstrapped Model Predictive Control
Yuhang Wang, Hanwei Guo, Sizhe Wang et al.
ICLR 2025 poster, arXiv:2503.18871
5 citations
In-Context Fully Decentralized Cooperative Multi-Agent Reinforcement Learning
Chao Li, Bingkun BAO, Yang Gao
NeurIPS 2025 poster
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu, Lingfeng Zhao, Shivangi Agarwal et al.
NeurIPS 2025 poster, arXiv:2502.08021
4 citations