A tour of reinforcement learning: The view from continuous control
2 Pith papers cite this work.
representative citing papers
- Stability-Certified On-Policy Data-Driven LQR via Recursive Learning and Policy Gradient
Relearn LQR combines recursive least squares with policy gradient for on-policy data-driven LQR and proves stability of the full scheme via Lyapunov analysis with averaging and timescale separation.
- Harnessing Embodied Agents: Runtime Governance for Policy-Constrained Execution