"tabular markov decision processes" Papers
3 papers found
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng, Haochen Zhang, Lingzhou Xue
ICLR 2025posterarXiv:2410.07574
9
citations
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo, Laixi Shi, Gauri Joshi et al.
ICML 2024poster
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Han Zhong, Jiachen Hu, Yecheng Xue et al.
ICML 2024poster