A gradient flow on a continuous-time Bellman error parametrized by feedback gain converges to the optimal LQR controller and stays inside the stabilizing region.
Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control
5 Pith papers cite this work.
verdicts
UNVERDICTED 5
representative citing papers
An indirect optimization method inner-approximates the set of passivating state-feedback gains for continuous-time LTI systems by a convex polytope and uses projected gradient flow to minimize the LQR cost inside that polytope.
Riemannian regularization reshapes the policy optimization landscape to enable learning of Kalman gains from data under unknown and rank-deficient covariances with non-asymptotic convergence guarantees.
A convex data-driven formulation yields the optimal LQI feedback gain for continuous-time systems directly from measured data without system matrices.
The authors adapt closed-loop and IRL parameterizations to continuous time, deriving policy iteration schemes, a data-driven CARE, convex reformulations, and a policy gradient flow while unifying the two approaches.
citing papers explorer
- Towards Optimal Passive Feedback Control of LTI Systems under LQR Performance
  An indirect optimization method inner-approximates the set of passivating state-feedback gains for continuous-time LTI systems by a convex polytope and uses projected gradient flow to minimize the LQR cost inside that polytope.
- Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations
  The authors adapt closed-loop and IRL parameterizations to continuous time, deriving policy iteration schemes, a data-driven CARE, convex reformulations, and a policy gradient flow while unifying the two approaches.