"markov decision processes" Papers
23 papers found
Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes
Haotian Wu, Gongpu Chen, Deniz Gunduz
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
Zhenyu Tao, Wei Xu, Xiaohu You
Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
Mehran Shakerinava, Siamak Ravanbakhsh, Adam Oberman
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Hao Liang, Zhiquan Luo
CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models
Shengzhuang Chen, Yikai Liao, Xiaoxiao Sun et al.
Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy et al.
Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour
Non-convex entropic mean-field optimization via Best Response flow
Razvan-Andrei Lascu, Mateusz Majka
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.
On the Convergence of Single-Timescale Actor-Critic
Navdeep Kumar, Priyank Agrawal, Giorgia Ramponi et al.
REINFORCE Converges to Optimal Policies with Any Learning Rate
Samuel Robertson, Thang Chu, Bo Dai et al.
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao, Babak Shahbaba, Annie Qu
SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
Jongmin Lee, Meiqi Sun, Pieter Abbeel
AI Alignment with Changing and Influenceable Reward Functions
Micah Carroll, Davis Foote, Anand Siththaranjan et al.
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling
Danil Provodin, Maurits Kaptein, Mykola Pechenizkiy
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Riccardo De Santi, Federico Arangath Joseph, Noah Liniger et al.
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
Lei Zhao, Mengdi Wang, Yu Bai
Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
Kishan Panaganti, Adam Wierman, Eric Mazumdar
On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang, Raman Arora
Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes
David Klaška, Antonín Kučera, Vojtěch Kůr et al.
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
Subhojyoti Mukherjee, Josiah Hanna, Robert Nowak
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
Fengdi Che, Chenjun Xiao, Jincheng Mei et al.
Test-Time Regret Minimization in Meta Reinforcement Learning
Mirco Mutti, Aviv Tamar