"stochastic environments" Papers
3 papers found
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng, Mian Deng, Chenjing Liang et al.
NeurIPS 2025 poster, arXiv:2510.18442
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu, Simon Zhan, Yixuan Wang et al.
ICML 2024 poster
To the Max: Reinventing Reward in Reinforcement Learning
Grigorii Veviurko, Wendelin Boehmer, Mathijs de Weerdt
ICML 2024 poster