
Lyapunov-Certified Direct Switching Theory for Q-Learning

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Q-learning is a fundamental algorithmic primitive in reinforcement learning. This paper develops a new framework for analyzing Q-learning from a switching linear system (SLS) viewpoint. In particular, we derive a stochastic SLS representation of the Q-learning error, and a finite-time error analysis through the joint spectral radius (JSR) of the corresponding SLS model, where the JSR is the exact worst-case exponential rate of the associated SLS. To the best of our knowledge, this is the first convergence rate analysis of standard Q-learning whose leading exponential rate is expressed through the JSR. The resulting rate is tied to the intrinsic worst-case exponential rate of the direct SLS representation and can be sharper than row-sum upper bounds when those bounds are conservative.
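The abstract's comparison between the JSR and row-sum bounds can be illustrated numerically. The JSR of a finite matrix set is the limit, as k grows, of the largest k-th root of the norm over all length-k products, and for any fixed k it is bracketed from below by the largest k-th root of a product's spectral radius and from above by the largest k-th root of a product's norm. The sketch below applies this brute-force bracket to two toy 2x2 modes. The matrices `A1` and `A2`, the helper names, and the choice `k=6` are illustrative assumptions, not taken from the paper, and the enumeration is not the paper's analysis technique.

```python
import cmath
from itertools import product


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_sum_norm(a):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def spectral_radius_2x2(a):
    """Largest eigenvalue modulus of a 2x2 matrix (characteristic polynomial)."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def jsr_bounds(mats, k):
    """Bracket the JSR using every product of length k (cost grows as len(mats)**k)."""
    lower, upper = 0.0, 0.0
    for combo in product(mats, repeat=k):
        p = combo[0]
        for m in combo[1:]:
            p = matmul(m, p)
        lower = max(lower, spectral_radius_2x2(p) ** (1.0 / k))
        upper = max(upper, row_sum_norm(p) ** (1.0 / k))
    return lower, upper


# Hypothetical switching modes; each has row-sum norm 0.9.
A1 = [[0.3, 0.6], [0.0, 0.3]]
A2 = [[0.3, 0.0], [0.6, 0.3]]
MODES = [A1, A2]

one_step_bound = max(row_sum_norm(m) for m in MODES)
lower, upper = jsr_bounds(MODES, k=6)
print(f"one-step row-sum bound: {one_step_bound:.4f}")
print(f"JSR bracket from length-6 products: [{lower:.4f}, {upper:.4f}]")
```

For this toy pair the upper end of the bracket falls strictly below the one-step row-sum bound of 0.9, which is the kind of gap the abstract describes as a conservative row-sum bound. The enumeration is exponential in `k`, so it only suits very small examples.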

citation summary

  • years: 2026 (6)
  • verdicts: unverdicted (6)
  • roles: method (1)
  • polarities: use method (1)

representative citing papers

Heavy-Ball Q-Learning with Residual Weighting Correction

cs.LG · 2026-06-25 · unverdicted · novelty 7.0

Corrected heavy-ball Q-learning with convergence and acceleration guarantees is derived via switched linear system and joint spectral radius analysis, extended to linear function approximation.

Sign-Separated Finite-Time Error Analysis of Q-Learning

cs.AI · 2026-05-15 · unverdicted · novelty 7.0

Sign-separated analysis decomposes Q-learning errors into negative parts dominated by an optimal-policy LTI system and positive parts controlled by a switching system, yielding finite-time bounds for deterministic and stochastic cases.

Switching-Geometry Analysis of Deflated Q-Value Iteration

math.OC · 2026-05-11 · unverdicted · novelty 7.0 · 2 refs

Deflated Q-value iteration admits a projected switching-system model whose joint spectral radius can be strictly smaller than the discount factor, yielding a sharper convergence characterization while leaving the greedy policy sequence unchanged.

A Switching System Theory of Q-Learning with Linear Function Approximation

cs.LG · 2026-05-10 · unverdicted · novelty 7.0 · 2 refs

Derives an exact linear switched model for the mean dynamics of Q-learning with linear function approximation and relates convergence to joint spectral radius stability of the switched system, extending the view to stochastic and regularized cases.

citing papers explorer

Showing 6 of 6 citing papers.

  • Heavy-Ball Q-Learning with Residual Weighting Correction cs.LG · 2026-06-25 · unverdicted · none · ref 22 · internal anchor

    Corrected heavy-ball Q-learning with convergence and acceleration guarantees is derived via switched linear system and joint spectral radius analysis, extended to linear function approximation.

  • Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics stat.ML · 2026-05-31 · unverdicted · none · ref 13 · internal anchor

    Periodic and soft target updates guarantee convergence in linear Q-learning to the exact projected Q-Bellman solution under spectral and step-size conditions via joint spectral radius analysis of switched linear systems.

  • Sign-Separated Finite-Time Error Analysis of Q-Learning cs.AI · 2026-05-15 · unverdicted · none · ref 12 · internal anchor

    Sign-separated analysis decomposes Q-learning errors into negative parts dominated by an optimal-policy LTI system and positive parts controlled by a switching system, yielding finite-time bounds for deterministic and stochastic cases.

  • Switching-Geometry Analysis of Deflated Q-Value Iteration math.OC · 2026-05-11 · unverdicted · none · ref 11 · 2 links · internal anchor

    Deflated Q-value iteration admits a projected switching-system model whose joint spectral radius can be strictly smaller than the discount factor, yielding a sharper convergence characterization while leaving the greedy policy sequence unchanged.

  • A Switching System Theory of Q-Learning with Linear Function Approximation cs.LG · 2026-05-10 · unverdicted · none · ref 15 · 2 links · internal anchor

    Derives an exact linear switched model for the mean dynamics of Q-learning with linear function approximation and relates convergence to joint spectral radius stability of the switched system, extending the view to stochastic and regularized cases.

  • Geometrically Averaged Hard Target Updates for Linear Q-Learning cs.LG · 2026-06-09 · unverdicted · none · ref 12 · internal anchor

    Introduces and analyzes the λ-target update for linear Q-learning via geometric averaging of periodic target maps, studied with a switching-system model in the deterministic case.