NEURIPS 2025 "markov decision processes" Papers

10 papers found

A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications

Zhenyu Tao, Wei Xu, Xiaohu You

NEURIPS 2025posterarXiv:2509.18714
2
citations

Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs

Mehran Shakerinava, Siamak Ravanbakhsh, Adam Oberman

NEURIPS 2025spotlightarXiv:2505.12049

Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds

Hao Liang, Zhiquan Luo

NEURIPS 2025posterarXiv:2210.14051
18
citations

Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design

Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour

NEURIPS 2025posterarXiv:2506.09508
1
citations

Non-convex entropic mean-field optimization via Best Response flow

Razvan-Andrei Lascu, Mateusz Majka

NEURIPS 2025posterarXiv:2505.22760
1
citations

No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes

Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.

NEURIPS 2025oralarXiv:2510.20725

On the Convergence of Single-Timescale Actor-Critic

Navdeep Kumar, Priyank Agrawal, Giorgia Ramponi et al.

NEURIPS 2025posterarXiv:2410.08868
1
citations

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Swetha Ganesh, Vaneet Aggarwal

NEURIPS 2025posterarXiv:2505.19986
2
citations

REINFORCE Converges to Optimal Policies with Any Learning Rate

Samuel Robertson, Thang Chu, Bo Dai et al.

NEURIPS 2025poster

REINFORCEMENT LEARNING FOR INDIVIDUAL OPTIMAL POLICY FROM HETEROGENEOUS DATA

Rui Miao, Babak Shahbaba, Annie Qu

NEURIPS 2025posterarXiv:2505.09496
1
citations