Policy Gradient Adaptive Control for the
5 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 5
citing papers explorer
- A Data-Enabled Primal-Dual Approach for Policy Learning with SDP Formulations
  A primal-dual online framework updates policies from closed-loop data for SDP-based control synthesis in linear discrete-time systems, with local linear tracking and global ergodic convergence guarantees under persistency of excitation and slow data variation.
- Direct Data-Driven Linear Quadratic Tracking via Policy Optimization
  A reference-decoupled reformulation makes direct data-driven LQT equivalent to certainty-equivalence solutions and supports convergent offline and online DeePO algorithms.
- Global Convergence of Policy Gradient Methods for ReLU Controllers in Linear Quadratic Regulation
  Model-based policy gradient converges globally to the optimal scalar LQR gain for discounted LQR using overparameterized ReLU networks by reducing the controller to two effective gains on positive and negative half-lines.
- Sample-Efficient Model-Free Policy Gradient Methods for Stochastic LQR via Robust Linear Regression
  Primal-dual robust linear regression enables O(1/epsilon) sample complexity for model-free policy gradient methods on stochastic LQR.
- Stability of Certainty-Equivalent Adaptive LQR for Linear Systems with Unknown Time-Varying Parameters
  LMS estimation paired with certainty-equivalent LQR delivers finite-gain ℓ²-stability for linear systems with unknown time-varying parameters and disturbances.