NEURIPS 2025 "markov decision processes" Papers
10 papers found
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
Zhenyu Tao, Wei Xu, Xiaohu You
NEURIPS 2025posterarXiv:2509.18714
2
citations
Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
Mehran Shakerinava, Siamak Ravanbakhsh, Adam Oberman
NEURIPS 2025spotlightarXiv:2505.12049
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Hao Liang, Zhiquan Luo
NEURIPS 2025posterarXiv:2210.14051
18
citations
Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour
NEURIPS 2025posterarXiv:2506.09508
1
citations
Non-convex entropic mean-field optimization via Best Response flow
Razvan-Andrei Lascu, Mateusz Majka
NEURIPS 2025posterarXiv:2505.22760
1
citations
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.
NEURIPS 2025oralarXiv:2510.20725
On the Convergence of Single-Timescale Actor-Critic
Navdeep Kumar, Priyank Agrawal, Giorgia Ramponi et al.
NEURIPS 2025posterarXiv:2410.08868
1
citations
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
Swetha Ganesh, Vaneet Aggarwal
NEURIPS 2025posterarXiv:2505.19986
2
citations
REINFORCE Converges to Optimal Policies with Any Learning Rate
Samuel Robertson, Thang Chu, Bo Dai et al.
NEURIPS 2025poster
REINFORCEMENT LEARNING FOR INDIVIDUAL OPTIMAL POLICY FROM HETEROGENEOUS DATA
Rui Miao, Babak Shahbaba, Annie Qu
NEURIPS 2025posterarXiv:2505.09496
1
citations