2025 Poster "off-policy learning" Papers
6 papers found
Bootstrap Off-policy with World Model
Guojian Zhan, Likun Wang, Xiangteng Zhang et al.
NeurIPS 2025, poster, arXiv:2511.00423
1 citation
MultiScale Contextual Bandits for Long Term Objectives
Richa Rastogi, Yuta Saito, Thorsten Joachims
NeurIPS 2025, poster, arXiv:2503.17674
SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
Jongmin Lee, Meiqi Sun, Pieter Abbeel
ICLR 2025, poster, arXiv:2512.10042
ShiQ: Bringing back Bellman to LLMs
Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.
NeurIPS 2025, poster, arXiv:2505.11081
1 citation
Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound
Tal Fiskus, Uri Shaham
NeurIPS 2025, poster, arXiv:2507.11269
Value Improved Actor Critic Algorithms
Yaniv Oren, Moritz Zanger, Pascal van der Vaart et al.
NeurIPS 2025, poster, arXiv:2406.01423