Nonasymptotic CLT and error bounds for two-time-scale stochastic approximation. arXiv preprint arXiv:2502.09884
2 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Gaussian Approximation for Asynchronous Q-learning
  Derives rates of order up to n^{-1/6} log^4(nSA) for the high-dimensional CLT of averaged asynchronous Q-learning iterates, plus a general martingale-difference CLT.
- Central Limit Theorems for Asynchronous Averaged Q-Learning
  Establishes non-asymptotic and functional central limit theorems for asynchronous averaged Q-learning with explicit rates depending on iterations, state-action space, discount factor, and exploration quality.