The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

13citations
arXiv:2401.07844
13
Citations
#303
in NeurIPS 2025
of 5858 papers
3
Authors
4
Data Points

Abstract

Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gradient descent and temporal difference learning. One fundamental challenge in analyzing a stochastic approximation algorithm is to establish its stability, i.e., to show that the stochastic vector iterates are bounded almost surely. In this paper, we extend the celebrated Borkar-Meyn theorem for stability from the Martingale difference noise setting to the Markovian noise setting, which greatly improves its applicability in reinforcement learning, especially in those off-policy reinforcement learning algorithms with linear function approximation and eligibility traces. Central to our analysis is the diminishing asymptotic rate of change of a few functions, which is implied by both a form of the strong law of large numbers and a form of the law of the iterated logarithm.

Citation History

Jan 26, 2026
0
Jan 27, 2026
0
Jan 27, 2026
0
Feb 2, 2026
13+13