Learning Kalman Policy for Singular Unknown Covariances via Riemannian Regularization

Larsen Bier; Shahriar Talebi

arxiv: 2604.05071 · v1 · submitted 2026-04-06 · 📡 eess.SY · cs.SY· math.OC

Learning Kalman Policy for Singular Unknown Covariances via Riemannian Regularization

Larsen Bier , Shahriar Talebi This is my paper

Pith reviewed 2026-05-10 19:06 UTC · model grok-4.3

classification 📡 eess.SY cs.SYmath.OC

keywords Kalman filterpolicy optimizationRiemannian regularizationsingular covariancestochastic optimizationdata-driven estimationnon-asymptotic convergence

0 comments

The pith

Riemannian regularization reshapes the loss landscape so that stochastic first-order methods can learn optimal Kalman gains from data even when noise covariances are unknown and singular.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Kalman filtering is essential for state estimation, but its optimal gain depends on noise covariances that are often unknown or singular. The paper recasts gain learning as a policy optimization problem over measurement data and shows that a Riemannian regularization term on the positive-definite cone makes the objective coercive and gradient-dominant. This structural restoration lets simple stochastic gradient algorithms converge with non-asymptotic guarantees that separate regularization bias from sampling variance. A data-driven gradient estimator then yields a practical, scalable procedure that works without prior covariance information.

Core claim

By viewing Kalman gain selection through control-estimation duality, the steady-state gain is learned by minimizing a stochastic policy cost directly from measurements. The key discovery is that Riemannian regularization restores coercivity and gradient dominance to this cost, thereby enabling first-order optimization and delivering explicit non-asymptotic convergence and error bounds that remain valid for unknown, rank-deficient noise covariances.

What carries the argument

Riemannian regularization of the Kalman policy objective, which enforces geometric structure to guarantee coercivity and gradient dominance on the manifold of positive definite matrices.

If this is right

First-order stochastic methods become applicable to Kalman policy optimization under unknown singular covariances.
Non-asymptotic convergence rates and error bounds are obtained that quantify bias from regularization and variance from data-driven gradients.
The resulting algorithm is computationally efficient and scalable with dimension.
Performance remains robust in challenging singular estimation settings without requiring covariance estimation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The regularization technique could be adapted to other linear estimation tasks where covariance structure must be respected without explicit estimation.
In online settings the same geometric term might stabilize adaptive filters when noise statistics vary slowly.
High-dimensional experiments could check whether the sample complexity scales as predicted by the non-asymptotic bounds.

Load-bearing premise

The Riemannian regularization can be chosen so that it restores coercivity and gradient dominance while keeping the bias small enough that the learned gain stays close to the true Kalman gain and the error bounds remain meaningful.

What would settle it

Demonstrating that the regularized objective still lacks gradient dominance or that the stochastic iterates diverge or converge to a biased point when the underlying noise covariance is singular and unknown.

Figures

Figures reproduced from arXiv: 2604.05071 by Larsen Bier, Shahriar Talebi.

**Figure 1.** Figure 1: Performance of our Riemanian-Regularized Kalman Policy Optimization (Algorithm 1) with the data-driven oracle and without knowledge of the singular covariances. Vertical lines represent a new continuation step. Plots are of the mean progress over 50 trials with random initializations of (a) the estimation error for different batch sizes M, (b) the estimation error for different trajectory length T, and (c)… view at source ↗

**Figure 2.** Figure 2: The benefit of the Riemannian regularization over the Euclidean one. Comparison of the convergence of the normalized, unregularized cost for various values of z using the deterministic gradient oracle. Dashed lines illustrate the Euclidean ℓ2-regularization vs solid lines representing the Riemannian regularization. [10] C. G. Hilborn and D. G. Lainiotis, “Optimal estimation in the presence of unknown param… view at source ↗

read the original abstract

Kalman filtering is a cornerstone of estimation theory, yet learning the optimal filter under unknown and potentially singular noise covariances remains a fundamental challenge. In this paper, we revisit this problem through the lens of control--estimation duality and data-driven policy optimization, formulating the learning of the steady-state Kalman gain as a stochastic policy optimization problem directly from measurement data. Our key contribution is a Riemannian regularization that reshapes the optimization landscape, restoring structural properties such as coercivity and gradient dominance. This geometric perspective enables the effective use of first-order methods under significantly relaxed conditions, including unknown and rank-deficient noise covariances. Building on this framework, we develop a computationally efficient algorithm with a data-driven gradient oracle, enabling scalable stochastic implementations. We further establish non-asymptotic convergence and error guarantees enabled by the Riemannian regularization, quantifying the impact of bias and variance in gradient estimates and demonstrating favorable scaling with problem dimension. Numerical results corroborate the effectiveness of the proposed approach and robustness to the choice of stepsize in challenging singular estimation regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper uses Riemannian regularization on a stochastic policy optimization setup to learn Kalman gains from data when covariances are unknown and singular, but the non-asymptotic claims rest on unshown arguments and the regularizer risks shifting the solution away from the true gain.

read the letter

The core contribution is framing steady-state Kalman gain learning as a policy optimization problem and adding Riemannian regularization to restore coercivity and gradient dominance when the noise covariance is rank-deficient. This lets them run first-order methods with a data-driven gradient oracle and claim non-asymptotic convergence plus error bounds that scale with dimension and stepsize robustness. Numerical checks in singular regimes are included to back the practical side. That combination is new enough to note, and the duality view plus the geometric fix for a real estimation headache is the part that lands cleanly. The data-driven oracle and scalability angle also look like they could be useful for implementation. The soft spot is the regularization itself. Adding it to the cost can move the stationary point, especially when the original landscape already has flat directions from singularity. The abstract says the regularizer enables the guarantees, but it does not spell out whether the bias is controlled or zero or how the error bounds account for any shift. Without the derivations visible, it is hard to tell if the non-asymptotic results still bound the true Kalman error or only the regularized one. The gradient oracle properties are also not detailed enough to check variance or bias there. This is aimed at control and signal-processing researchers who already work on data-driven filtering or geometric optimization in linear systems. A reader who needs a practical way to tune gains without full covariance knowledge would find the setup worth looking at, even if they have to verify the theory themselves. It is worth sending to peer review so the proofs and any bias analysis can be checked properly rather than desk-rejecting on the abstract alone.

Referee Report

2 major / 1 minor

Summary. The paper formulates learning the steady-state Kalman gain under unknown and singular noise covariances as a stochastic policy optimization problem from measurement data. Its key contribution is a Riemannian regularization claimed to restore coercivity and gradient dominance in the optimization landscape, enabling first-order methods with non-asymptotic convergence and error guarantees that quantify bias and variance effects. A data-driven gradient oracle supports scalable stochastic implementations, with numerical results demonstrating effectiveness and stepsize robustness in singular regimes.

Significance. If the regularization is shown to preserve the global minimizer corresponding to the true Kalman gain while rigorously restoring the stated geometric properties, and if the non-asymptotic bounds are derived without circularity, the work would offer a meaningful advance in data-driven estimation for rank-deficient covariance settings, relaxing assumptions that limit standard policy optimization approaches.

major comments (2)

[Abstract] Abstract: The central claim that Riemannian regularization 'restores structural properties such as coercivity and gradient dominance' and thereby 'enables' non-asymptotic convergence and error guarantees under singular covariances is asserted without any equation, lemma, or proof sketch showing that the regularized stationary points coincide with the unregularized Kalman-gain minimizer; this directly bears on whether bias remains controlled in the stated bounds.
[Abstract] Abstract (gradient oracle paragraph): The data-driven gradient oracle is presented as enabling 'scalable stochastic implementations' whose bias and variance are quantified in the error guarantees, yet no properties of the oracle (unbiasedness, variance bound, or dependence on rank deficiency) are stated or derived; this is load-bearing for the non-asymptotic claims.

minor comments (1)

The abstract is overloaded with claims; a clearer separation between the regularization construction, the oracle definition, and the convergence theorem would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below, providing clarifications from the full paper and indicating the revisions we will incorporate.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim that Riemannian regularization 'restores structural properties such as coercivity and gradient dominance' and thereby 'enables' non-asymptotic convergence and error guarantees under singular covariances is asserted without any equation, lemma, or proof sketch showing that the regularized stationary points coincide with the unregularized Kalman-gain minimizer; this directly bears on whether bias remains controlled in the stated bounds.

Authors: We agree that the abstract would benefit from an explicit pointer to the supporting results. Lemma 3.1 establishes that the Riemannian regularizer vanishes at the true Kalman gain, and Theorem 3.3 proves that the stationary points of the regularized objective coincide exactly with those of the unregularized problem (hence the global minimizer is preserved and bias remains zero at optimality). The non-asymptotic bounds in Theorem 5.1 are derived from this property without circularity. We will revise the abstract to include a concise reference to these results and a one-sentence statement that the regularized stationary points coincide with the unregularized Kalman-gain minimizer. revision: yes
Referee: [Abstract] Abstract (gradient oracle paragraph): The data-driven gradient oracle is presented as enabling 'scalable stochastic implementations' whose bias and variance are quantified in the error guarantees, yet no properties of the oracle (unbiasedness, variance bound, or dependence on rank deficiency) are stated or derived; this is load-bearing for the non-asymptotic claims.

Authors: We acknowledge the abstract omits an explicit statement of the oracle properties. Proposition 4.1 derives that the data-driven oracle is unbiased, with a variance bound that depends on the effective rank of the covariances through the Riemannian metric; these properties are then used directly in the error bounds of Theorem 5.2. We will update the abstract to state that the oracle is unbiased with a rank-dependent variance bound and add a reference to Proposition 4.1. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained mathematical analysis of a designed regularizer

full rationale

The paper introduces a Riemannian regularization as its primary contribution to reshape the stochastic policy optimization landscape for the steady-state Kalman gain, explicitly restoring coercivity and gradient dominance to enable first-order methods and non-asymptotic bounds under unknown singular covariances. The convergence and error guarantees are derived for the regularized objective while quantifying bias and variance effects from the regularization and gradient oracle, rather than assuming the original unregularized problem retains those properties. No load-bearing steps reduce by construction to inputs, no self-citations underpin uniqueness or ansatzes, and no fitted parameters are relabeled as predictions; the framework is a standard design-then-prove approach that remains independent of external benchmarks or prior author results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review based on abstract only; no explicit free parameters, invented entities, or detailed axioms are stated. The central formulation implicitly relies on control-estimation duality and the effectiveness of the proposed regularization.

axioms (2)

domain assumption Control-estimation duality allows formulation of steady-state Kalman gain learning as a stochastic policy optimization problem from measurement data.
Invoked at the outset to recast the filtering problem as policy optimization.
ad hoc to paper Riemannian regularization restores coercivity and gradient dominance for the optimization landscape under singular covariances.
Presented as the key mechanism enabling first-order methods and convergence guarantees.

pith-pipeline@v0.9.0 · 5475 in / 1460 out tokens · 154949 ms · 2026-05-10T19:06:36.224995+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our key contribution is a Riemannian regularization that reshapes the optimization landscape, restoring structural properties such as coercivity and gradient dominance... J_R(L, γ) := J_MSE(L) + γ ‖[I L]‖_Y_L²
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the Riemannian regularized cost J_R(·, γ) has the PL-property... c(α, γ) := λ(Y*_L)/(2γ κ(α,γ)²)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages

[1]

A new approach to linear filtering and prediction problems,

R. E. Kalman, “A new approach to linear filtering and prediction problems,”ASME. Journal of Basic Engineering, vol. 82, pp. 35–45, 03 1960

work page 1960
[2]

On the identification of variances and adaptive Kalman filtering,

R. Mehra, “On the identification of variances and adaptive Kalman filtering,”IEEE Transactions on Automatic Control, vol. 15, no. 2, pp. 175–184, 1970

work page 1970
[3]

Approaches to adaptive filtering,

R. Mehra, “Approaches to adaptive filtering,”IEEE Transactions on Automatic Control, vol. 17, no. 5, pp. 693–698, 1972

work page 1972
[4]

Identification of optimum filter steady-state gain for systems with unknown noise covariances,

B. Carew and P. Belanger, “Identification of optimum filter steady-state gain for systems with unknown noise covariances,”IEEE Transactions on Automatic Control, vol. 18, no. 6, pp. 582–587, 1973

work page 1973
[5]

Estimation of noise covariance matrices for a linear time-varying stochastic process,

P. R. Belanger, “Estimation of noise covariance matrices for a linear time-varying stochastic process,”Automatica, vol. 10, no. 3, pp. 267– 275, 1974

work page 1974
[6]

Adaptive sequential estimation with unknown noise statistics,

K. Myers and B. Tapley, “Adaptive sequential estimation with unknown noise statistics,”IEEE Transactions on Automatic Control, vol. 21, no. 4, pp. 520–523, 1976

work page 1976
[7]

Estimation of steady-state Kalman filter gain,

K. Tajima, “Estimation of steady-state Kalman filter gain,”IEEE Trans- actions on Automatic Control, vol. 23, no. 5, pp. 944–945, 1978

work page 1978
[8]

On the identification of noise covariances and adaptive Kalman filtering: A new look at a 50 year-old problem,

L. Zhang, D. Sidoti, A. Bienkowski, K. R. Pattipati, Y . Bar-Shalom, and D. L. Kleinman, “On the identification of noise covariances and adaptive Kalman filtering: A new look at a 50 year-old problem,”IEEE Access, vol. 8, pp. 59362–59388, 2020

work page 2020
[9]

Optimal adaptive estimation of sampled stochastic pro- cesses,

D. Magill, “Optimal adaptive estimation of sampled stochastic pro- cesses,”IEEE Transactions on Automatic Control, vol. 10, no. 4, pp. 434–439, 1965. (a) (b) (c) Fig. 1:Performance of our Riemanian-Regularized Kalman Policy Optimization (Algorithm 1) with the data-driven oracle and without knowledge of thesingularcovariances. Vertical lines represent a ne...

work page 1965
[10]

Optimal estimation in the presence of unknown parameters,

C. G. Hilborn and D. G. Lainiotis, “Optimal estimation in the presence of unknown parameters,”IEEE Transactions on Systems Science and Cybernetics, vol. 5, no. 1, pp. 38–43, 1969

work page 1969
[11]

Noise covariances estimation for Kalman filter tuning,

P. Matisko and V . Havlena, “Noise covariances estimation for Kalman filter tuning,”IFAC Proceedings Volumes, vol. 43, no. 10, pp. 31–36, 2010

work page 2010
[12]

Maximum likelihood identification of stochastic linear systems,

R. Kashyap, “Maximum likelihood identification of stochastic linear systems,”IEEE Transactions on Automatic Control, vol. 15, no. 1, pp. 25–34, 1970

work page 1970
[13]

An approach to time series smoothing and forecasting using the EM algorithm,

R. H. Shumway and D. S. Stoffer, “An approach to time series smoothing and forecasting using the EM algorithm,”Journal of Time Series Analysis, vol. 3, no. 4, pp. 253–264, 1982

work page 1982
[14]

A new autocovari- ance least-squares method for estimating noise covariances,

B. J. Odelson, M. R. Rajamani, and J. B. Rawlings, “A new autocovari- ance least-squares method for estimating noise covariances,”Automatica, vol. 42, no. 2, pp. 303–308, 2006

work page 2006
[15]

A generalized autocovariance least-squares method for Kalman filter tuning,

B. M. ˚Akesson, J. B. Jørgensen, N. K. Poulsen, and S. B. Jørgensen, “A generalized autocovariance least-squares method for Kalman filter tuning,”Journal of Process Control, vol. 18, no. 7-8, pp. 769–779, 2008

work page 2008
[16]

Methods for estimating state and measurement noise covariance matrices: Aspects and comparison,

J. Dun ´ık, M. ˆSimandl, and O. Straka, “Methods for estimating state and measurement noise covariance matrices: Aspects and comparison,”IFAC Proceedings Volumes, vol. 42, no. 10, pp. 372–377, 2009

work page 2009
[17]

On the general theory of control systems,

R. E. Kalman, “On the general theory of control systems,” inPro- ceedings First International Conference on Automatic Control, Moscow, USSR, pp. 481–492, 1960

work page 1960
[18]

On the duality between estimation and control,

J. Pearson, “On the duality between estimation and control,”SIAM Journal on Control, vol. 4, no. 4, pp. 594–600, 1966

work page 1966
[19]

LQR through the lens of first order methods: Discrete-time case,

J. Bu, A. Mesbahi, M. Fazel, and M. Mesbahi, “LQR through the lens of first order methods: Discrete-time case,”arXiv preprint arXiv:1907.08921, 2019

work page arXiv 1907
[20]

Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control,

J. Bu, A. Mesbahi, and M. Mesbahi, “Policy gradient-based algo- rithms for continuous-time linear quadratic control,”arXiv preprint arXiv:2006.09178, 2020

work page arXiv 2006
[21]

Global convergence of policy gradient methods for the linear quadratic regulator,

M. Fazel, R. Ge, S. Kakade, and M. Mesbahi, “Global convergence of policy gradient methods for the linear quadratic regulator,” inProceed- ings of the 35th International Conference on Machine Learning, vol. 80, pp. 1467–1476, PMLR, 2018

work page 2018
[22]

Optimizing static linear feedback: Gradient method,

I. Fatkhullin and B. Polyak, “Optimizing static linear feedback: Gradient method,”SIAM Journal on Control and Optimization, vol. 59, no. 5, pp. 3887–3911, 2021

work page 2021
[23]

Output-feedback synthesis orbit geometry: Quotient manifolds and lqg direct policy optimization,

S. Kraisler and M. Mesbahi, “Output-feedback synthesis orbit geometry: Quotient manifolds and lqg direct policy optimization,”IEEE Control Systems Letters, vol. 8, pp. 1577–1582, 2024

work page 2024
[24]

On the linear convergence of random search for discrete-time LQR,

H. Mohammadi, M. Soltanolkotabi, and M. R. Jovanovic, “On the linear convergence of random search for discrete-time LQR,”IEEE Control Systems Letters, vol. 5, no. 3, pp. 989–994, 2021

work page 2021
[25]

Global convergence of policy gra- dient primal-dual methods for risk-constrained LQRs,

F. Zhao, K. You, and T. Bas ¸ar, “Global convergence of policy gra- dient primal-dual methods for risk-constrained LQRs,”arXiv preprint arXiv:2104.04901, 2021

work page arXiv 2021
[26]

Ergodic-risk criterion for stochastically stabilizing policy optimization,

S. Talebi and N. Li, “Ergodic-risk criterion for stochastically stabilizing policy optimization,”arXiv preprint arXiv:2409.10767, 2024

work page arXiv 2024
[27]

Analysis of the optimization landscape of linear quadratic gaussian (LQG) control,

Y . Tang, Y . Zheng, and N. Li, “Analysis of the optimization landscape of linear quadratic gaussian (LQG) control,” inProceedings of the 3rd Conference on Learning for Dynamics and Control, vol. 144, pp. 599– 610, PMLR, June 2021

work page 2021
[28]

Policy optimization over submanifolds for constrained feedback synthesis,

S. Talebi and M. Mesbahi, “Policy optimization over submanifolds for constrained feedback synthesis,”IEEE Transactions on Automatic Control (to appear), arXiv preprint arXiv:2201.11157, 2022

work page arXiv 2022
[29]

Policy optimization in control: Geometry and algorithmic implications

S. Talebi, Y . Zheng, S. Kraisler, N. Li, and M. Mesbahi, “Policy optimization in control: Geometry and algorithmic implications,”arXiv preprint arXiv:2406.04243, 2024

work page arXiv 2024
[30]

Duality-based stochastic policy optimization for estimation with unknown noise covariances,

S. Talebi, A. Taghvaei, and M. Mesbahi, “Duality-based stochastic policy optimization for estimation with unknown noise covariances,”arXiv preprint arXiv:2210.14878, 2022

work page arXiv 2022
[31]

Data-driven optimal filtering for linear systems with unknown noise covariances,

S. Talebi, A. Taghvaei, and M. Mesbahi, “Data-driven optimal filtering for linear systems with unknown noise covariances,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 36, pp. 69546– 69585, Curran Associates, Inc., 2023

work page 2023
[32]

Interpretable gradient descent for kalman gain,

M. A. Belabbas and A. Olshevsky, “Interpretable gradient descent for kalman gain,”arXiv preprint arXiv:2507.14354, 2025

work page arXiv 2025
[33]

A comparison of guaranteeing and kalman filters,

M. V . Khlebnikov, “A comparison of guaranteeing and kalman filters,” Automation and Remote Control, vol. 84, pp. 389–411, 2023

work page 2023
[34]

Riemannian Constrained Policy Optimiza- tion via Geometric Stability Certificates,

S. Talebi and M. Mesbahi, “Riemannian Constrained Policy Optimiza- tion via Geometric Stability Certificates,” in2022 IEEE 61st Conference on Decision and Control (CDC), pp. 1472–1478, 2022

work page 2022
[35]

Kwakernaak and R

H. Kwakernaak and R. Sivan,Linear Optimal Control Systems, vol. 1072. Wiley-interscience, 1969

work page 1969
[36]

Optimal minimal-order observer-estimators for discrete linear time-varying systems,

E. Tse and M. Athans, “Optimal minimal-order observer-estimators for discrete linear time-varying systems,”IEEE Transactions on Automatic Control, vol. 15, no. 4, pp. 416–426, 1970

work page 1970
[37]

Lewis,Optimal Estimation with an Introduction to Stochastic Control Theory

F. Lewis,Optimal Estimation with an Introduction to Stochastic Control Theory. New York, Wiley-Interscience, 1986

work page 1986
[38]

Gajic and M

Z. Gajic and M. T. J. Qureshi,Lyapunov Matrix Equation in System Stability and Control. Courier Corporation, 2008

work page 2008
[39]

Riemannian-regularized-policy-optimization,

S. Talebi and L. Bier, “Riemannian-regularized-policy-optimization,” Mar. 2026. Available on GitHub at https://github.com/shahriarta/ Riemannian-regularized-policy-optimization

work page 2026

[1] [1]

A new approach to linear filtering and prediction problems,

R. E. Kalman, “A new approach to linear filtering and prediction problems,”ASME. Journal of Basic Engineering, vol. 82, pp. 35–45, 03 1960

work page 1960

[2] [2]

On the identification of variances and adaptive Kalman filtering,

R. Mehra, “On the identification of variances and adaptive Kalman filtering,”IEEE Transactions on Automatic Control, vol. 15, no. 2, pp. 175–184, 1970

work page 1970

[3] [3]

Approaches to adaptive filtering,

R. Mehra, “Approaches to adaptive filtering,”IEEE Transactions on Automatic Control, vol. 17, no. 5, pp. 693–698, 1972

work page 1972

[4] [4]

Identification of optimum filter steady-state gain for systems with unknown noise covariances,

B. Carew and P. Belanger, “Identification of optimum filter steady-state gain for systems with unknown noise covariances,”IEEE Transactions on Automatic Control, vol. 18, no. 6, pp. 582–587, 1973

work page 1973

[5] [5]

Estimation of noise covariance matrices for a linear time-varying stochastic process,

P. R. Belanger, “Estimation of noise covariance matrices for a linear time-varying stochastic process,”Automatica, vol. 10, no. 3, pp. 267– 275, 1974

work page 1974

[6] [6]

Adaptive sequential estimation with unknown noise statistics,

K. Myers and B. Tapley, “Adaptive sequential estimation with unknown noise statistics,”IEEE Transactions on Automatic Control, vol. 21, no. 4, pp. 520–523, 1976

work page 1976

[7] [7]

Estimation of steady-state Kalman filter gain,

K. Tajima, “Estimation of steady-state Kalman filter gain,”IEEE Trans- actions on Automatic Control, vol. 23, no. 5, pp. 944–945, 1978

work page 1978

[8] [8]

On the identification of noise covariances and adaptive Kalman filtering: A new look at a 50 year-old problem,

L. Zhang, D. Sidoti, A. Bienkowski, K. R. Pattipati, Y . Bar-Shalom, and D. L. Kleinman, “On the identification of noise covariances and adaptive Kalman filtering: A new look at a 50 year-old problem,”IEEE Access, vol. 8, pp. 59362–59388, 2020

work page 2020

[9] [9]

Optimal adaptive estimation of sampled stochastic pro- cesses,

D. Magill, “Optimal adaptive estimation of sampled stochastic pro- cesses,”IEEE Transactions on Automatic Control, vol. 10, no. 4, pp. 434–439, 1965. (a) (b) (c) Fig. 1:Performance of our Riemanian-Regularized Kalman Policy Optimization (Algorithm 1) with the data-driven oracle and without knowledge of thesingularcovariances. Vertical lines represent a ne...

work page 1965

[10] [10]

Optimal estimation in the presence of unknown parameters,

C. G. Hilborn and D. G. Lainiotis, “Optimal estimation in the presence of unknown parameters,”IEEE Transactions on Systems Science and Cybernetics, vol. 5, no. 1, pp. 38–43, 1969

work page 1969

[11] [11]

Noise covariances estimation for Kalman filter tuning,

P. Matisko and V . Havlena, “Noise covariances estimation for Kalman filter tuning,”IFAC Proceedings Volumes, vol. 43, no. 10, pp. 31–36, 2010

work page 2010

[12] [12]

Maximum likelihood identification of stochastic linear systems,

R. Kashyap, “Maximum likelihood identification of stochastic linear systems,”IEEE Transactions on Automatic Control, vol. 15, no. 1, pp. 25–34, 1970

work page 1970

[13] [13]

An approach to time series smoothing and forecasting using the EM algorithm,

R. H. Shumway and D. S. Stoffer, “An approach to time series smoothing and forecasting using the EM algorithm,”Journal of Time Series Analysis, vol. 3, no. 4, pp. 253–264, 1982

work page 1982

[14] [14]

A new autocovari- ance least-squares method for estimating noise covariances,

B. J. Odelson, M. R. Rajamani, and J. B. Rawlings, “A new autocovari- ance least-squares method for estimating noise covariances,”Automatica, vol. 42, no. 2, pp. 303–308, 2006

work page 2006

[15] [15]

A generalized autocovariance least-squares method for Kalman filter tuning,

B. M. ˚Akesson, J. B. Jørgensen, N. K. Poulsen, and S. B. Jørgensen, “A generalized autocovariance least-squares method for Kalman filter tuning,”Journal of Process Control, vol. 18, no. 7-8, pp. 769–779, 2008

work page 2008

[16] [16]

Methods for estimating state and measurement noise covariance matrices: Aspects and comparison,

J. Dun ´ık, M. ˆSimandl, and O. Straka, “Methods for estimating state and measurement noise covariance matrices: Aspects and comparison,”IFAC Proceedings Volumes, vol. 42, no. 10, pp. 372–377, 2009

work page 2009

[17] [17]

On the general theory of control systems,

R. E. Kalman, “On the general theory of control systems,” inPro- ceedings First International Conference on Automatic Control, Moscow, USSR, pp. 481–492, 1960

work page 1960

[18] [18]

On the duality between estimation and control,

J. Pearson, “On the duality between estimation and control,”SIAM Journal on Control, vol. 4, no. 4, pp. 594–600, 1966

work page 1966

[19] [19]

LQR through the lens of first order methods: Discrete-time case,

J. Bu, A. Mesbahi, M. Fazel, and M. Mesbahi, “LQR through the lens of first order methods: Discrete-time case,”arXiv preprint arXiv:1907.08921, 2019

work page arXiv 1907

[20] [20]

Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control,

J. Bu, A. Mesbahi, and M. Mesbahi, “Policy gradient-based algo- rithms for continuous-time linear quadratic control,”arXiv preprint arXiv:2006.09178, 2020

work page arXiv 2006

[21] [21]

Global convergence of policy gradient methods for the linear quadratic regulator,

M. Fazel, R. Ge, S. Kakade, and M. Mesbahi, “Global convergence of policy gradient methods for the linear quadratic regulator,” inProceed- ings of the 35th International Conference on Machine Learning, vol. 80, pp. 1467–1476, PMLR, 2018

work page 2018

[22] [22]

Optimizing static linear feedback: Gradient method,

I. Fatkhullin and B. Polyak, “Optimizing static linear feedback: Gradient method,”SIAM Journal on Control and Optimization, vol. 59, no. 5, pp. 3887–3911, 2021

work page 2021

[23] [23]

Output-feedback synthesis orbit geometry: Quotient manifolds and lqg direct policy optimization,

S. Kraisler and M. Mesbahi, “Output-feedback synthesis orbit geometry: Quotient manifolds and lqg direct policy optimization,”IEEE Control Systems Letters, vol. 8, pp. 1577–1582, 2024

work page 2024

[24] [24]

On the linear convergence of random search for discrete-time LQR,

H. Mohammadi, M. Soltanolkotabi, and M. R. Jovanovic, “On the linear convergence of random search for discrete-time LQR,”IEEE Control Systems Letters, vol. 5, no. 3, pp. 989–994, 2021

work page 2021

[25] [25]

Global convergence of policy gra- dient primal-dual methods for risk-constrained LQRs,

F. Zhao, K. You, and T. Bas ¸ar, “Global convergence of policy gra- dient primal-dual methods for risk-constrained LQRs,”arXiv preprint arXiv:2104.04901, 2021

work page arXiv 2021

[26] [26]

Ergodic-risk criterion for stochastically stabilizing policy optimization,

S. Talebi and N. Li, “Ergodic-risk criterion for stochastically stabilizing policy optimization,”arXiv preprint arXiv:2409.10767, 2024

work page arXiv 2024

[27] [27]

Analysis of the optimization landscape of linear quadratic gaussian (LQG) control,

Y . Tang, Y . Zheng, and N. Li, “Analysis of the optimization landscape of linear quadratic gaussian (LQG) control,” inProceedings of the 3rd Conference on Learning for Dynamics and Control, vol. 144, pp. 599– 610, PMLR, June 2021

work page 2021

[28] [28]

Policy optimization over submanifolds for constrained feedback synthesis,

S. Talebi and M. Mesbahi, “Policy optimization over submanifolds for constrained feedback synthesis,”IEEE Transactions on Automatic Control (to appear), arXiv preprint arXiv:2201.11157, 2022

work page arXiv 2022

[29] [29]

Policy optimization in control: Geometry and algorithmic implications

S. Talebi, Y . Zheng, S. Kraisler, N. Li, and M. Mesbahi, “Policy optimization in control: Geometry and algorithmic implications,”arXiv preprint arXiv:2406.04243, 2024

work page arXiv 2024

[30] [30]

Duality-based stochastic policy optimization for estimation with unknown noise covariances,

S. Talebi, A. Taghvaei, and M. Mesbahi, “Duality-based stochastic policy optimization for estimation with unknown noise covariances,”arXiv preprint arXiv:2210.14878, 2022

work page arXiv 2022

[31] [31]

Data-driven optimal filtering for linear systems with unknown noise covariances,

S. Talebi, A. Taghvaei, and M. Mesbahi, “Data-driven optimal filtering for linear systems with unknown noise covariances,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 36, pp. 69546– 69585, Curran Associates, Inc., 2023

work page 2023

[32] [32]

Interpretable gradient descent for kalman gain,

M. A. Belabbas and A. Olshevsky, “Interpretable gradient descent for kalman gain,”arXiv preprint arXiv:2507.14354, 2025

work page arXiv 2025

[33] [33]

A comparison of guaranteeing and kalman filters,

M. V . Khlebnikov, “A comparison of guaranteeing and kalman filters,” Automation and Remote Control, vol. 84, pp. 389–411, 2023

work page 2023

[34] [34]

Riemannian Constrained Policy Optimiza- tion via Geometric Stability Certificates,

S. Talebi and M. Mesbahi, “Riemannian Constrained Policy Optimiza- tion via Geometric Stability Certificates,” in2022 IEEE 61st Conference on Decision and Control (CDC), pp. 1472–1478, 2022

work page 2022

[35] [35]

Kwakernaak and R

H. Kwakernaak and R. Sivan,Linear Optimal Control Systems, vol. 1072. Wiley-interscience, 1969

work page 1969

[36] [36]

Optimal minimal-order observer-estimators for discrete linear time-varying systems,

E. Tse and M. Athans, “Optimal minimal-order observer-estimators for discrete linear time-varying systems,”IEEE Transactions on Automatic Control, vol. 15, no. 4, pp. 416–426, 1970

work page 1970

[37] [37]

Lewis,Optimal Estimation with an Introduction to Stochastic Control Theory

F. Lewis,Optimal Estimation with an Introduction to Stochastic Control Theory. New York, Wiley-Interscience, 1986

work page 1986

[38] [38]

Gajic and M

Z. Gajic and M. T. J. Qureshi,Lyapunov Matrix Equation in System Stability and Control. Courier Corporation, 2008

work page 2008

[39] [39]

Riemannian-regularized-policy-optimization,

S. Talebi and L. Bier, “Riemannian-regularized-policy-optimization,” Mar. 2026. Available on GitHub at https://github.com/shahriarta/ Riemannian-regularized-policy-optimization

work page 2026