Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control

arXiv 2006.09178 · 2020

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Bridging Continuous-time LQR and Reinforcement Learning via Gradient Flow of the Bellman Error

eess.SY · 2025-06-11 · unverdicted · novelty 7.0

A gradient flow on a continuous-time Bellman error, parametrized by the feedback gain, converges to the optimal LQR controller while remaining inside the stabilizing region.

Towards Optimal Passive Feedback Control of LTI Systems under LQR Performance

math.OC · 2026-04-16 · unverdicted · novelty 6.0

An indirect optimization method inner-approximates the set of passivating state-feedback gains for continuous-time LTI systems by a convex polytope and uses projected gradient flow to minimize the LQR cost inside that polytope.

Learning Kalman Policy for Singular Unknown Covariances via Riemannian Regularization

eess.SY · 2026-04-06 · unverdicted · novelty 6.0

Riemannian regularization reshapes the policy optimization landscape to enable learning of Kalman gains from data under unknown and rank-deficient covariances with non-asymptotic convergence guarantees.

Data-driven Linear Quadratic Integral Control: A Convex Formulation and Policy Gradient Approach

eess.SY · 2026-04-16 · unverdicted · novelty 5.0

A convex data-driven formulation yields the optimal LQI feedback gain for continuous-time systems directly from measured data without system matrices.

Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations

math.OC · 2026-04-30 · unverdicted · novelty 4.0

The authors adapt closed-loop and IRL parameterizations to continuous time, deriving policy iteration schemes, a data-driven CARE, convex reformulations, and a policy gradient flow while unifying the two approaches.
