A gradient flow on a continuous-time Bellman error parametrized by feedback gain converges to the optimal LQR controller and stays inside the stabilizing region.
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
citing papers explorer
- Bridging Continuous-time LQR and Reinforcement Learning via Gradient Flow of the Bellman Error
  A gradient flow on a continuous-time Bellman error parametrized by feedback gain converges to the optimal LQR controller and stays inside the stabilizing region.
- Data-driven Linear Quadratic Integral Control: A Convex Formulation and Policy Gradient Approach
  A convex data-driven formulation yields the optimal LQI feedback gain for continuous-time systems directly from measured data without system matrices.
- Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations
  The authors adapt closed-loop and IRL parameterizations to continuous time, deriving policy iteration schemes, a data-driven CARE, convex reformulations, and a policy gradient flow while unifying the two approaches.