An algorithm for constrained stochastic LQR achieves $\tilde{O}(\sqrt{T})$ regret and chance-constraint satisfaction via SDP-based optimistic policies scaled for safety.
Then, (24) implies that $d\lambda^{-1}\nu\,(\mu + \kappa^{2}\gamma^{-1}\eta D^{2}) \le \dfrac{\epsilon^{2}}{8\,\Phi^{-1}(1-\delta)^{2}}$
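To give a feel for the scale of the right-hand side of the bound quoted above, the sketch below evaluates $\epsilon^{2} / (8\,\Phi^{-1}(1-\delta)^{2})$ numerically. It assumes that $\Phi$ is the standard normal CDF and that $\delta < 1/2$; neither is stated in the quoted line. The function name `rhs_bound` and the sample values of $\epsilon$ and $\delta$ are hypothetical, chosen for illustration only.

```python
from statistics import NormalDist


def rhs_bound(epsilon: float, delta: float) -> float:
    """Evaluate epsilon^2 / (8 * Phi^{-1}(1 - delta)^2).

    Assumption: Phi is the standard normal CDF, so Phi^{-1} is its quantile
    function. The source line does not define Phi explicitly.
    """
    if not 0.0 < delta < 0.5:
        # For delta >= 0.5 the quantile is zero or negative, and the
        # expression stops being a meaningful positive budget.
        raise ValueError("delta must lie in (0, 0.5)")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")

    quantile = NormalDist().inv_cdf(1.0 - delta)  # Phi^{-1}(1 - delta)
    return epsilon ** 2 / (8.0 * quantile ** 2)


if __name__ == "__main__":
    # Illustrative values only; they do not come from the source.
    for delta in (0.1, 0.05, 0.01):
        print(f"delta={delta}: bound={rhs_bound(0.1, delta):.6f}")
```

Under these assumptions the bound tightens as $\delta$ shrinks, since a stricter chance constraint leaves a smaller budget for the left-hand side, and it scales quadratically in $\epsilon$.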
Cited by 1 Pith paper (field: math.OC; year: 2026; verdict: unverdicted):
Rate-Optimal Regret for the Safe Learning-based Control of the Constrained Linear Quadratic Regulator