pith. machine review for the scientific record.

arxiv: 2605.08864 · v1 · submitted 2026-05-09 · 💻 cs.LG · math.ST · stat.TH

Recognition: 2 theorem links · Lean Theorem

Higher-Order Equilibrium Tracking for EM-Compressible Online Estimation

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 01:12 UTC · model grok-4.3

classification 💻 cs.LG · math.ST · stat.TH
keywords online EM · equilibrium tracking · latent variable models · central limit theorem · EM-compressibility · streaming estimation · finite-sample risk · moving optimum

The pith

An online estimator for latent-variable models inherits the batch central limit theorem and the sharp first-order risk constant when its tracking error behind the moving empirical optimum is o(T^{-1/2}).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper recasts online estimation in latent-variable models as tracking a moving empirical equilibrium rather than converging to a fixed population parameter. It decomposes the online estimate into the frozen batch optimum at the current running statistic plus the algorithm's tracking lag, then proves that sufficiently small lag transfers the batch central limit theorem and risk constant to the online setting. This separation matters for streaming data because it lets online algorithms retain the statistical efficiency of batch methods without requiring full recomputation. The framework introduces higher-order equilibrium-jet predictors paired with frozen correctors to achieve faster localized tracking rates under structural compressibility conditions that keep everything evaluable from retained statistics.

Core claim

The online estimate decomposes into the frozen batch equilibrium at the current running statistic and a tracking lag; provided the L^2 norm of that lag is o(T^{-1/2}), the online estimator inherits the batch central limit theorem and the sharp first-order risk constant. An m-th order equilibrium-jet predictor combined with an order-ν frozen corrector produces localized tracking rates O(T^{-ν(m+1)}). The results rest on EM-compressibility and EM-jet^R-compressibility, which let the equilibrium response and the Newton corrector be computed from a retained streaming statistic, as shown explicitly for latent linear Gaussian covariance estimation.
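Schematically, in the notation of the abstract and the Figure 1 caption (ϑ_T the online estimate, r(S_T) the frozen batch equilibrium, e_T := ϑ_T − r(S_T)); the population parameter θ* and the limiting covariance V are our labels here, not symbols taken from the paper:

```latex
\vartheta_T = r(S_T) + e_T,
\qquad
\sqrt{T}\,\bigl(r(S_T) - \theta^{\star}\bigr) \;\Rightarrow\; \mathcal{N}(0, V),
\qquad\text{and}\qquad
\lVert e_T \rVert_{L^2} = o\bigl(T^{-1/2}\bigr)
\;\Longrightarrow\;
\sqrt{T}\,\bigl(\vartheta_T - \theta^{\star}\bigr) \;\Rightarrow\; \mathcal{N}(0, V).
```

By Slutsky's theorem the √T-scaled lag vanishes in probability, so the online estimator inherits the batch limit; the jet-corrector rate O(T^{-ν(m+1)}) meets the hypothesis whenever ν(m+1) > 1/2.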

What carries the argument

The m-th order equilibrium-jet predictor paired with an order-ν frozen corrector, acting on the smooth equilibrium manifold indexed by the running statistic and enabled by EM-compressibility.
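A minimal toy of the predictor-corrector mechanism, assuming nothing from the paper beyond the tracking picture: the objective f_t(θ) = (θ − μ_t)²/2 has the running mean μ_t as its moving empirical optimum, so the retained statistic is (t, μ_t) and the lag is e_t = θ_t − μ_t. The damped corrector and exact drift predictor below are illustrative stand-ins for the order-ν corrector and m-th order jet, not the authors' construction.

```python
# Toy equilibrium tracking on f_t(theta) = (theta - mu_t)^2 / 2, whose
# moving empirical optimum IS the running mean mu_t.  Predictor and
# damped corrector are illustrative stand-ins, not the paper's scheme.
import numpy as np

rng = np.random.default_rng(0)
T = 100_000
x = rng.normal(1.0, 2.0, size=T)

mu = prev_mu = 0.0
theta_c = 0.0    # corrector only
theta_pc = 0.0   # predictor + corrector
for t in range(1, T + 1):
    prev_mu, mu = mu, mu + (x[t - 1] - mu) / t   # update running statistic
    theta_c += 0.5 * (mu - theta_c)              # damped frozen corrector
    theta_pc += mu - prev_mu                     # first-order jet: extrapolate drift
    theta_pc += 0.5 * (mu - theta_pc)            # same corrector after prediction
    if t in (100, 10_000, 100_000):
        print(f"t={t:>6}  lag corrector-only={abs(theta_c - mu):.2e}  "
              f"with predictor={abs(theta_pc - mu):.2e}")

# Corrector-only lag decays like O(1/t), already o(T^{-1/2}); the exact
# drift predictor removes the moving-target term entirely in this toy.
```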

If this is right

  • The online estimator matches the asymptotic distribution and first-order risk of the corresponding batch estimator.
  • Higher-order jet predictors deliver polynomial speed-ups in how quickly the online method catches the moving target.
  • In the Gaussian covariance example the method runs on a compressed d × d statistic with explicit finite-sample risk bounds and a restart rule; a minimal sketch of the compressed-statistic pattern follows this list.
  • The analysis cleanly separates movement of the empirical optimum from algorithmic delay.
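A sketch of the compressed-statistic pattern, with a stand-in equilibrium map: the only retained state is the d × d running second moment S_t plus a counter. The function sigma_star below is a hypothetical placeholder for the paper's frozen equilibrium Σ°(S_t), which we do not reproduce here.

```python
# Streaming pattern: O(d^2) memory, frozen equilibrium re-evaluated from
# the compressed statistic alone.  sigma_star() is a hypothetical stand-in.
import numpy as np

def sigma_star(S):
    """Placeholder for the frozen equilibrium map; the paper's version
    solves the latent-model stationarity equations from S alone."""
    return S  # stand-in: identity map on the compressed statistic

d, T = 5, 20_000
rng = np.random.default_rng(1)
A = rng.normal(size=(d, d)) / np.sqrt(d)
L = np.linalg.cholesky(A @ A.T + np.eye(d))   # ground-truth covariance factor

S = np.zeros((d, d))          # compressed d x d statistic
Sigma = np.eye(d)             # online state
for t in range(1, T + 1):
    xt = L @ rng.normal(size=d)
    S += (np.outer(xt, xt) - S) / t        # O(d^2) update of the statistic
    target = sigma_star(S)                 # frozen equilibrium at S_t
    Sigma += 0.5 * (target - Sigma)        # damped frozen corrector

lag = np.linalg.norm(Sigma - sigma_star(S))   # terminal tracking lag
print("||e_T||_F:", lag, "  sqrt(T)*||e_T||_F:", np.sqrt(T) * lag)
```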

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same decomposition could be applied to other drifting-target problems in stochastic approximation beyond latent-variable models.
  • Algorithm designers might adaptively select predictor order according to observed lag size and available compute.
  • The compressibility conditions suggest a route to designing new streaming estimators that retain only low-dimensional summaries.

Load-bearing premise

The empirical optimum moves smoothly on an equilibrium manifold indexed by the running statistic, and the model satisfies the EM-compressibility conditions that let responses be recovered from streaming statistics.

What would settle it

A direct comparison in which the online estimator's asymptotic variance or risk constant deviates from the batch values precisely when the observed tracking error decays more slowly than T^{-1/2}, or when the measured convergence rate fails to improve with higher predictor order.
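A minimal protocol sketch of that settling experiment on the toy mean model from above (ours, not the paper's experiment): compare T·Var of the online and batch estimates across replicates under a corrector fast enough for transfer (constant step) and one at the boundary (step 1/t), for which the lag in this toy is of exact order T^{-1/2}.

```python
# Settling test on the toy: a step-1/t corrector tracks the running mean
# too slowly (||e_T|| of exact order T^{-1/2}, not o(T^{-1/2})) and, in
# this toy, inflates the online variance constant by a factor of about 2;
# a constant-step corrector restores batch-to-online transfer.
import numpy as np

def replicate(T, gamma, rng):
    x = rng.normal(1.0, 2.0, size=T)        # theta* = 1, sigma^2 = 4
    mu = theta = 0.0
    for t in range(1, T + 1):
        mu += (x[t - 1] - mu) / t           # batch optimum r(S_t)
        theta += gamma(t) * (mu - theta)    # online corrector
    return theta, mu

rng = np.random.default_rng(2)
T, R = 4_000, 500
for label, gamma in [("gamma = 0.5 (lag o(T^-1/2))", lambda t: 0.5),
                     ("gamma = 1/t (lag ~ T^-1/2) ", lambda t: 1.0 / t)]:
    pairs = [replicate(T, gamma, rng) for _ in range(R)]
    v_on = T * np.var([th for th, _ in pairs])
    v_ba = T * np.var([m for _, m in pairs])
    print(label, " T*Var online:", round(v_on, 2), " batch:", round(v_ba, 2))
```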

Figures

Figures reproduced from arXiv: 2605.08864 by Yue Song, Zhiming Li.

Figure 1. Online estimation as equilibrium tracking. The running statistic S_t makes the frozen empirical equilibrium Σ°(S_t) drift at scale O(t^{-1}). Prediction extrapolates this drift, correction contracts the online state toward the new target, and the remaining terminal lag e_T controls batch-to-online transfer, where r(S_T) carries the statistical fluctuation of the batch estimator and e_T := ϑ_T − r(S_T) is the …
Figure 2. Numerical validation of the tracking mechanism and batch-to-online transfer.
Figure 3. Curvature/eigengap stress test. Horizontal bars show log–log slopes for 24 settings (L/M/S …).
Figure 4. CG tolerance ablation. Left: tracking slope vs …
Figure 5. Isserlis identity: empirical Fisher covariance vs analytic formula. Gray lines: individual …
Figure 6. Restart localization: fraction of replicates remaining inside the contraction tube. Random …
Original abstract

We study online estimation in latent-variable models by recasting the problem as tracking a moving empirical equilibrium. Standard online EM and stochastic approximation analyses primarily study convergence toward the population parameter and typically do not isolate the empirical batch optimum from the online tracking error at finite horizon. Our framework decomposes the online estimate into the frozen batch equilibrium at the current running statistic and a tracking lag that captures the algorithm's delay behind this moving target. We prove a batch-to-online transfer theorem: provided $\lVert e_T \rVert_{L^{2}} = o(T^{-1/2})$, the online estimator inherits the batch central limit theorem and the sharp first-order risk constant. Our key observation is that the empirical optimum evolves on a smooth equilibrium manifold indexed by the running statistic. An $m$-th order equilibrium-jet predictor combined with an order-$\nu$ frozen corrector yields localized tracking rates $O(T^{-\nu(m+1)})$. We formalize EM-compressibility and EM-jet$^R$-compressibility as the structural conditions that make the equilibrium response and the Newton corrector evaluable from a retained streaming statistic. The theory is instantiated in latent linear Gaussian covariance estimation, where the first-order scheme operates on a compressed $d \times d$ statistic with explicit finite-sample risk envelopes and a certified restart rule.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper recasts online estimation for latent-variable models as tracking a moving empirical equilibrium on a smooth manifold indexed by the running statistic. It proves a conditional batch-to-online transfer theorem: if the tracking error satisfies ||e_T||_{L^2} = o(T^{-1/2}), the online estimator inherits the batch CLT and sharp first-order risk constant. An m-th order equilibrium-jet predictor paired with an order-ν frozen corrector is shown to deliver localized tracking rates O(T^{-ν(m+1)}), under the structural assumptions of EM-compressibility and EM-jet^R-compressibility that permit evaluation from a retained streaming statistic. The framework is instantiated for latent linear Gaussian covariance estimation using a compressed d×d statistic, with explicit finite-sample risk envelopes and a certified restart rule.

Significance. If the transfer theorem and rate results hold, the work supplies a systematic design principle for online EM-type algorithms that asymptotically recover batch performance without sacrificing the sharp risk constant. The higher-order jet construction provides explicit rates that can satisfy the o(T^{-1/2}) hypothesis, and the concrete instantiation with compressed statistics, finite-sample bounds, and restart rule offers immediately usable tools. These elements constitute a clear advance over standard stochastic-approximation analyses that focus only on population convergence.

major comments (2)
  1. [Abstract / Transfer Theorem] The central claim that the online estimator inherits the batch CLT and sharp risk constant rests on the hypothesis ||e_T||_{L^2} = o(T^{-1/2}). The manuscript does not supply explicit error-bar derivations or a verification that the O(T^{-ν(m+1)}) rate achieved by the m-th order jet predictor and order-ν corrector meets this condition for the free parameters m and ν without additional post-hoc tuning. This hypothesis is load-bearing for the transfer result.
  2. [Instantiation: latent linear Gaussian covariance estimation] While finite-sample risk envelopes and a restart rule are provided, the section does not include a direct check (analytic or numerical) that the realized tracking error ||e_T||_{L^2} is indeed o(T^{-1/2}) under the chosen compressibility conditions and for representative values of m and ν. Without this, the applicability of the transfer theorem to the concrete estimator remains unconfirmed.
minor comments (2)
  1. [Definitions] The notation EM-jet^R-compressibility is introduced without an accompanying equation that explicitly shows how the Newton corrector is recovered from the retained statistic; adding a displayed equation would improve readability.
  2. [Theory section] A short table summarizing the dependence of the tracking rate on the pair (m, ν) and the minimal values needed to satisfy o(T^{-1/2}) would help readers quickly assess parameter choices; an illustrative version follows below.
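An illustrative version of the requested table, read directly off the stated rate O(T^{-ν(m+1)}) under the rebuttal's convention that m ≥ 0 and ν ≥ 1 are integers; the transfer hypothesis needs ν(m+1) > 1/2, so every such integer pair already qualifies, with m = 0, ν = 1 the minimal choice:

  rate exponent ν(m+1):   ν = 1   ν = 2   ν = 3
  m = 0                     1       2       3
  m = 1                     2       4       6
  m = 2                     3       6       9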

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for highlighting the load-bearing role of the tracking-error hypothesis in the transfer theorem. We address both major comments below and will revise the manuscript accordingly to strengthen the presentation.

Point-by-point responses
  1. Referee: [Abstract / Transfer Theorem] The central claim that the online estimator inherits the batch CLT and sharp risk constant rests on the hypothesis ||e_T||_{L^2} = o(T^{-1/2}). The manuscript does not supply explicit error-bar derivations or a verification that the O(T^{-ν(m+1)}) rate achieved by the m-th order jet predictor and order-ν corrector meets this condition for the free parameters m and ν without additional post-hoc tuning. This hypothesis is load-bearing for the transfer result.

    Authors: We agree that the o(T^{-1/2}) condition is essential for the batch-to-online transfer. The manuscript already establishes the localized tracking rate O(T^{-ν(m+1)}) under EM-compressibility and EM-jet^R-compressibility. Because m and ν are user-chosen integers (m ≥ 0, ν ≥ 1), any choice satisfying ν(m+1) > 1/2 automatically yields the required o(T^{-1/2}) rate; standard selections such as m=1, ν=1 give O(T^{-2}), which is strictly faster. In the revision we will add an explicit corollary stating the minimal parameter condition ν(m+1) > 1/2 together with the corresponding error-bar derivation that converts the big-O rate into the little-o statement, thereby removing any need for post-hoc tuning. revision: yes
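For concreteness, the promised corollary's big-O to little-o step is one line (C is our label for the constant in the tracking bound, not the paper's notation):

```latex
\lVert e_T \rVert_{L^2} \le C\, T^{-\nu(m+1)}
\;\Longrightarrow\;
T^{1/2}\, \lVert e_T \rVert_{L^2} \le C\, T^{1/2 - \nu(m+1)} \longrightarrow 0
\quad \text{whenever } \nu(m+1) > \tfrac{1}{2},
```

i.e. ||e_T||_{L^2} = o(T^{-1/2}), as the transfer theorem requires.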

  2. Referee: [Instantiation: latent linear Gaussian covariance estimation] While finite-sample risk envelopes and a restart rule are provided, the section does not include a direct check (analytic or numerical) that the realized tracking error ||e_T||_{L^2} is indeed o(T^{-1/2}) under the chosen compressibility conditions and for representative values of m and ν. Without this, the applicability of the transfer theorem to the concrete estimator remains unconfirmed.

    Authors: We concur that an explicit verification in the instantiation would confirm applicability. The first-order scheme in the covariance example corresponds to m=0, ν=1, producing the rate O(T^{-1}), which is already o(T^{-1/2}). We will insert a short analytic paragraph deriving the L^2 tracking error bound from the general rate under the compressed d×d statistic and the EM-compressibility conditions, together with a brief numerical illustration for moderate d that plots the empirical ||e_T||_{L^2} decay. This addition will directly link the concrete estimator to the transfer hypothesis. revision: yes
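A sketch of the kind of numerical illustration the rebuttal promises, run on the toy mean model rather than the paper's estimator: estimate the log–log slope of the empirical L^2 tracking error across horizons (expected slope ≈ −1 here, the analogue of the m = 0, ν = 1 rate O(T^{-1})).

```python
# Empirical ||e_T||_{L^2} decay for a constant-step corrector tracking the
# running mean: fit a log-log slope across horizons (expect about -1).
import numpy as np

rng = np.random.default_rng(3)
Ts, R = [500, 1_000, 2_000, 4_000, 8_000], 200
err = []
for T in Ts:
    sq = 0.0
    for _ in range(R):
        x = rng.normal(1.0, 2.0, size=T)
        mu = theta = 0.0
        for t in range(1, T + 1):
            mu += (x[t - 1] - mu) / t
            theta += 0.5 * (mu - theta)      # constant-step corrector
        sq += (theta - mu) ** 2
    err.append(np.sqrt(sq / R))              # empirical L^2 tracking error

slope = np.polyfit(np.log(Ts), np.log(err), 1)[0]
print("log-log slope of ||e_T||_{L^2}:", round(slope, 2))
```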

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's central result is a conditional batch-to-online transfer theorem: under the explicit hypothesis ||e_T||_{L^2} = o(T^{-1/2}), the online estimator inherits the batch CLT and risk constant. The m-th order jet predictor plus ν-order frozen corrector is explicitly constructed to deliver the faster rate O(T^{-ν(m+1)}) that satisfies the hypothesis whenever the stated EM-compressibility conditions hold. This is a standard constructive verification of a sufficient condition rather than a reduction of the theorem to its own inputs by definition or fitting. No load-bearing self-citation, ansatz smuggling, or renaming of known results appears in the derivation chain; the argument rests on standard manifold smoothness and stochastic approximation assumptions that remain independent of the paper's fitted quantities or prior self-references.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 2 invented entities

The framework rests on smoothness of the equilibrium manifold and the two compressibility conditions; m and ν are design parameters chosen by the user rather than fitted to data.

free parameters (2)
  • predictor order m
    Chosen by hand to set the order of the equilibrium-jet approximation.
  • corrector order ν
    Chosen by hand to set the order of the frozen corrector.
axioms (2)
  • domain assumption The empirical optimum evolves on a smooth equilibrium manifold indexed by the running statistic.
    Invoked to justify the jet predictor construction.
  • domain assumption EM-compressibility and EM-jet^R-compressibility hold.
    Required for the response and Newton corrector to be evaluable from the retained statistic.
invented entities (2)
  • equilibrium manifold · no independent evidence
    purpose: Models the evolution of the batch optimum as a function of the running statistic.
    Central modeling device introduced to enable higher-order tracking.
  • EM-compressibility · no independent evidence
    purpose: Structural condition allowing the equilibrium response to be computed from a compressed streaming statistic.
    New formalization that makes the method memory-efficient.

pith-pipeline@v0.9.0 · 5530 in / 1568 out tokens · 74153 ms · 2026-05-12T01:12:13.977336+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 1 internal anchor
