Nonparametric Sparse Online Learning of the Koopman Operator

Alec Koppel; Boya Hou; Nathan Dahlin; Sina Sanjari; Subhonmesh Bose

arxiv: 2405.07432 · v4 · submitted 2024-05-13 · 📊 stat.ML · cs.LG· cs.SY· eess.SY

Nonparametric Sparse Online Learning of the Koopman Operator

Boya Hou , Sina Sanjari , Nathan Dahlin , Alec Koppel , Subhonmesh Bose This is my paper

Pith reviewed 2026-05-24 01:09 UTC · model grok-4.3

classification 📊 stat.ML cs.LGcs.SYeess.SY

keywords Koopman operatoronline learningstochastic approximationreproducing kernel Hilbert spaceconditional mean embeddingssparse nonparametric learningdynamical systems

0 comments

The pith

A sparse online algorithm learns the Koopman operator iteratively from sequential data with convergence guarantees even when the chosen function space is misspecified.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a nonparametric method that updates an approximation to the Koopman operator on the fly as trajectory data arrive one step at a time. It works inside a reproducing kernel Hilbert space, imposes sparsity to keep the model size bounded, and still supplies both asymptotic and finite-time error bounds. When the true dynamics lie outside the chosen space, the algorithm is shown to converge to the conditional-mean-embedding operator instead, preserving the same guarantees under trajectory-based sampling.

Core claim

The Koopman operator, viewed as an operator on an RKHS, can be learned online by stochastic approximation; the resulting sparse iterates converge to the true operator (or to the related CME operator in the misspecified case) with explicit rates that hold for data arriving sequentially along system trajectories.

What carries the argument

Stochastic-approximation update of a sparse kernel representation of the Koopman operator, related to the conditional mean embedding operator when the RKHS is misspecified.

If this is right

The learned operator can be used for prediction and control on streaming data without storing the full history.
Model complexity stays bounded because sparsity is enforced at each step.
Both asymptotic consistency and non-asymptotic error bounds hold under the same sampling regime.
The same guarantees apply when the RKHS cannot represent the dynamics exactly, via the CME connection.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be paired with model-predictive control loops that update the predictor in real time.
Similar online sparse schemes might apply to learning other integral operators from sequential observations.
The finite-time bounds could be used to set explicit stopping criteria for online deployment.

Load-bearing premise

That data generated by the unknown dynamics can be treated as valid trajectory samples for the online updates, and that the Koopman operator remains well-defined as an operator on the chosen RKHS even when the dynamics escape that space.

What would settle it

A concrete dynamical system whose true Koopman operator, when approximated by the algorithm on fresh trajectories, produces prediction error that does not shrink at the stated rate.

Figures

Figures reproduced from arXiv: 2405.07432 by Alec Koppel, Boya Hou, Nathan Dahlin, Sina Sanjari, Subhonmesh Bose.

**Figure 2.** Figure 2: (a) Potential landscape and one trajectory of the Langevin dynamics; (b),(c),(d) [PITH_FULL_IMAGE:figures/full_fig_p011_2.png] view at source ↗

**Figure 3.** Figure 3: (a) Two trajectories of the Duffing oscillator that converge to two different [PITH_FULL_IMAGE:figures/full_fig_p016_3.png] view at source ↗

**Figure 4.** Figure 4: (Top) White dots, bold bars, and whiskers give median, 95% confidence [PITH_FULL_IMAGE:figures/full_fig_p018_4.png] view at source ↗

**Figure 5.** Figure 5: Normalized value functions with an increasing number of iterations. [PITH_FULL_IMAGE:figures/full_fig_p018_5.png] view at source ↗

read the original abstract

The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. However, existing data-driven approaches to learning the Koopman operator rely on batch data. In this work, we present a sparse online learning algorithm that learns the Koopman operator iteratively via stochastic approximation, with explicit control over model complexity and provable convergence guarantees. Specifically, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and address the mis-specified scenario where the dynamics may escape the chosen RKHS. In this mis-specified setting, we relate the Koopman operator to the conditional mean embeddings (CME) operator. We further establish both asymptotic and finite-time convergence guarantees for our learning algorithm in mis-specified setting, with trajectory-based sampling where the data arrive sequentially over time. Numerical experiments demonstrate the algorithm's capability to learn unknown nonlinear dynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives a sparse online stochastic approximation algorithm for the Koopman operator in RKHS with a CME link for misspecification and claimed finite-time convergence on trajectory data.

read the letter

The main takeaway is that the work moves Koopman learning from batch to an online sparse setting with explicit complexity control and convergence claims that cover the misspecified case by relating the operator to conditional mean embeddings. That combination of features is the concrete advance over the batch methods referenced in the abstract. The algorithm description and the CME connection are handled cleanly enough to make the misspecification handling look workable on paper. The finite-time bounds under sequential trajectory sampling are the part that needs scrutiny. Trajectory samples are serially dependent, and standard stochastic approximation finite-time arguments usually require independence or explicit mixing or blocking to control the dependence; the abstract states the sampling regime but gives no sign those conditions were inserted. If the proofs skip that step, the stated rates do not apply to the claimed setting. Experiments are mentioned only at the level of “demonstrate capability,” with no system details or baseline comparisons supplied, so their weight is hard to judge. This is aimed at researchers already working on kernel methods and data-driven dynamics who want an online variant with some theory attached. A reader who needs the finite-time result to be airtight will have to check the derivations closely. The paper is coherent on its own terms and makes specific, checkable claims, so it deserves a serious referee even if the dependence issue forces revisions.

Referee Report

2 major / 2 minor

Summary. The paper proposes a sparse online learning algorithm for the Koopman operator that operates iteratively via stochastic approximation in an RKHS, with explicit sparsity control. It addresses the misspecified case by relating the Koopman operator to the conditional mean embedding (CME) operator and claims both asymptotic and finite-time convergence guarantees under sequential trajectory-based sampling. Numerical experiments are included to illustrate learning of nonlinear dynamics.

Significance. If the finite-time bounds are valid under the stated trajectory sampling, the work would provide a useful nonparametric online method with complexity control for Koopman learning, extending batch methods and connecting to CME theory in the misspecified regime. The combination of sparsity, online updates, and explicit rates is a potential strength, though the dependence issue in sampling must be resolved for the guarantees to apply.

major comments (2)

[finite-time convergence theorem / §5] The finite-time convergence result (presumably in the theory section following the algorithm definition) applies standard stochastic approximation rates without inserting a mixing coefficient, blocking argument, or ergodicity condition to handle serial dependence in trajectory samples. Standard Robbins-Monro or similar analyses require martingale-difference or independent noise; trajectory data from a dynamical system violates this, so the stated rates do not directly apply to the claimed sampling regime.
[misspecified setting / relation to CME] The relation between the Koopman operator and the CME operator in the misspecified setting (abstract and corresponding theoretical section) is asserted but the precise sense in which the projection or embedding holds when the true dynamics escape the RKHS is not load-bearing without an explicit error decomposition or assumption on the residual; this underpins the misspecified guarantees.

minor comments (2)

[algorithm definition] Notation for the sparse projection or regularization parameter should be introduced earlier and used consistently across algorithm and analysis sections.
[experiments] The numerical experiments section would benefit from explicit specification of the RKHS kernel, trajectory length, and comparison baselines to allow reproduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive comments on our manuscript. We address each major comment below and commit to revisions that strengthen the theoretical analysis without altering the core contributions.

read point-by-point responses

Referee: [finite-time convergence theorem / §5] The finite-time convergence result (presumably in the theory section following the algorithm definition) applies standard stochastic approximation rates without inserting a mixing coefficient, blocking argument, or ergodicity condition to handle serial dependence in trajectory samples. Standard Robbins-Monro or similar analyses require martingale-difference or independent noise; trajectory data from a dynamical system violates this, so the stated rates do not directly apply to the claimed sampling regime.

Authors: We agree that the current finite-time analysis relies on standard stochastic approximation assumptions that do not explicitly address serial dependence under trajectory sampling. In the revised manuscript we will add an ergodicity assumption on the underlying dynamical system and incorporate a mixing coefficient into the finite-time bounds (via a blocking argument where appropriate) so that the stated rates apply rigorously to the sequential trajectory regime. revision: yes
Referee: [misspecified setting / relation to CME] The relation between the Koopman operator and the CME operator in the misspecified setting (abstract and corresponding theoretical section) is asserted but the precise sense in which the projection or embedding holds when the true dynamics escape the RKHS is not load-bearing without an explicit error decomposition or assumption on the residual; this underpins the misspecified guarantees.

Authors: We thank the referee for noting this gap. While the manuscript already connects the Koopman operator to the CME via conditional expectation, the treatment of the misspecified case will be strengthened by inserting an explicit error decomposition that isolates the residual term arising when the true dynamics lie outside the chosen RKHS; this will make the misspecified convergence guarantees fully rigorous. revision: yes

Circularity Check

0 steps flagged

No circularity: convergence claims rest on standard stochastic approximation applied to operator learning

full rationale

The paper derives an online sparse algorithm for the Koopman operator via stochastic approximation in an RKHS, relates it to the CME operator under mis-specification, and states asymptotic plus finite-time convergence guarantees for trajectory sampling. These guarantees are presented as following from the algorithm's iterative updates rather than from any self-definition, fitted parameter renamed as prediction, or load-bearing self-citation chain. No equation or claim reduces the target result to an input quantity by algebraic identity or by construction; the derivation chain remains independent of the claimed outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract alone supplies insufficient detail to enumerate free parameters, axioms, or invented entities; full text would be required to identify items such as kernel hyperparameters, sparsity thresholds, or background assumptions on sampling.

pith-pipeline@v0.9.0 · 5694 in / 1184 out tokens · 58684 ms · 2026-05-24T01:09:05.085423+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 3 internal anchors

[1]

Our algorithm builds on stochastic operator gradient descent (SOGD) for U to solve the regression problem in (3.4)

Sparse Online Learning Algorithm.Having established that K is the adjoint of U in Theorem 4.1, we next present an online algorithm to construct K iteratively. Our algorithm builds on stochastic operator gradient descent (SOGD) for U to solve the regression problem in (3.4). Our framework defines a sharp deviation from prior art that uses sample average ap...

work page 2000
[2]

Following Section 4, we make the following assumption on the regularity ofK, which encodes the regularity of the Markov kernel

Convergence Analysis with Trajectory-Based Sampling.We now pres- ent our theoretical results on the convergence behavior of the sparse SOGD algorithm proposed in Section 5. Following Section 4, we make the following assumption on the regularity ofK, which encodes the regularity of the Markov kernel. Assumption2.K∈HS (H,[H]β)for some β∈(0, 1)and ∥K∥HS(H,[H...

work page
[3]

Applications. 7.1. Analyzing Unknown Nonlinear Dynamics.Consider the unforced Duffing oscillator, described by ¨z=−δ˙z−z ( β+αz2) , with δ= 0.5, β=−1, and α= 1, where z∈Rand ˙z∈Rare the scalar position and velocity. Let x = (z,˙z), as shown in Figure 3(a), the Duffing dynamics exhibits two ROAs, corresponding to stable equilibrium points at x = (−1, 0) an...

work page
[4]

As the dataset size increases, the estimated value functions capture important features of the reference such as the high-value diagonal passing through the stationary, upright pendulum position

work page
[5]

Our method does not require the RKHS to be closed under the Markov kernel

Discussions and Conclusions.In this paper, we present an online learning algorithm that learns a sparse Koopman operator in RKHS with sampling from trajectories. Our method does not require the RKHS to be closed under the Markov kernel. We established the asymptotic and finite-time convergence guarantee of the online sparse SOGD algorithm. The computation...

work page 2011
[6]

Bach and E

F. Bach and E. Moulines,Non-strongly-convex smooth stochastic approxima- tion with convergence rate o (1/n), Advances in neural information processing systems, 26 (2013)

work page 2013
[7]

Berlinet and C

A. Berlinet and C. Thomas-Agnan,Reproducing kernel Hilbert spaces in probability and statistics, Springer Science & Business Media, 2011

work page 2011
[8]

Bevanda, M

P. Bevanda, M. Beier, A. Lederer, S. Sosnowski, E. H ¨ullermeier, and S. Hirche,Koopman kernel regression, Advances in Neural Information Processing Systems, 36 (2023), pp. 16207–16221

work page 2023
[9]

L. Bold, F. M. Philipp, M. Schaller, and K. Worthmann,Kernel-based koopman approximants for control: Flexible sampling, error analysis, and stability, arXiv preprint arXiv:2412.02811, (2024)

work page arXiv 2024
[10]

V. S. Borkar and V. S. Borkar,Stochastic approximation: a dynamical systems viewpoint, vol. 9, Springer, 2008. [7]G. Brockman,Openai gym, arXiv preprint arXiv:1606.01540, (2016)

work page internal anchor Pith review Pith/arXiv arXiv 2008
[11]

S. L. Brunton, J. L. Proctor, and J. N. Kutz,Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences, 113 (2016), pp. 3932–3937

work page 2016
[12]

Budisi´c, R

M. Budisi´c, R. Mohr, and I. Mezi ´c,Applied koopmanism, Chaos: An Inter- disciplinary Journal of Nonlinear Science, 22 (2012), p. 047510

work page 2012
[13]

Z. Chen, S. Zhang, T. T. Doan, J.-P. Clarke, and S. T. Maguluri, Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning, Automatica, 146 (2022), p. 110623. 42B. HOU, S. SANJARI, N. DAHLIN, A. KOPPEL AND B. SUBHONMESH

work page 2022
[14]

Ciliberto, L

C. Ciliberto, L. Rosasco, and A. Rudi,A consistent regularization approach for structured prediction, Advances in neural information processing systems, 29 (2016)

work page 2016
[15]

M. J. Colbrook, I. Mezi ´c, and A. Stepanenko,Limits and powers of koopman learning, arXiv preprint arXiv:2407.06312, (2024)

work page arXiv 2024
[16]

Dinculeanu,Vector integration and stochastic integration in Banach spaces, vol

N. Dinculeanu,Vector integration and stochastic integration in Banach spaces, vol. 48, John Wiley & Sons, 2000

work page 2000
[17]

Fischer and I

S. Fischer and I. Steinwart,Sobolev norm learning rates for regularized least-squares algorithms, The Journal of Machine Learning Research, 21 (2020), pp. 8464–8501

work page 2020
[18]

Giannakis, A

D. Giannakis, A. Henriksen, J. A. Tropp, and R. Ward,Learning to fore- cast dynamical systems from streaming data, SIAM Journal on Applied Dynamical Systems, 22 (2023), pp. 527–558

work page 2023
[19]

Gr¨unew¨alder, G

S. Gr¨unew¨alder, G. Lever, L. Baldassarre, S. Patterson, A. Gret- ton, and M. Pontil,Conditional mean embeddings as regressors, International Conference on Machine Learning, (2012)

work page 2012
[20]

Grunewalder, G

S. Grunewalder, G. Lever, L. Baldassarre, M. Pontil, and A. Gret- ton,Modelling transition dynamics in MDPs with RKHS embeddings, Interna- tional Conference on Machine Learning, (2012)

work page 2012
[21]

R. A. Horn and C. R. Johnson,Topics in matrix analysis, Cambridge univer- sity press, 1994

work page 1994
[22]

B. Hou, A. R. R. Matavalam, S. Bose, and U. Vaidya,Propagating uncertainty through system dynamics in reproducing kernel hilbert space, Physica D: Nonlinear Phenomena, (2024), p. 134168

work page 2024
[23]

B. Hou, S. Sanjari, N. Dahlin, and S. Bose,Compressed decentralized learning of conditional mean embedding operators in reproducing kernel hilbert spaces, Proceedings of the AAAI Conference on Artificial Intelligence, (2023)

work page 2023
[24]

B. Hou, S. Sanjari, N. Dahlin, S. Bose, and U. Vaidya,Sparse learning of dynamical systems in RKHS: An operator-theoretic approach, in International Conference on Machine Learning, PMLR, 2023, pp. 13325–13352

work page 2023
[25]

B. Hou, S. Sanjari, N. Dahlin, A. Koppel, and S. Bose,Nonparametric sparse online learning of the koopman operator, arXiv preprint arXiv:2405.07432, (2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[26]

M. R. Jovanovi´c, P. J. Schmid, and J. W. Nichols,Sparsity-promoting dynamic mode decomposition, Physics of Fluids, 26 (2014)

work page 2014
[27]

Kato,Perturbation theory for linear operators, vol

T. Kato,Perturbation theory for linear operators, vol. 132, Springer Science & Business Media, 2013

work page 2013
[28]

Kawahara,Dynamic mode decomposition with reproducing kernels for koop- man spectral analysis, Advances in neural information processing systems, 29 (2016)

Y. Kawahara,Dynamic mode decomposition with reproducing kernels for koop- man spectral analysis, Advances in neural information processing systems, 29 (2016)

work page 2016
[29]

S. Klus, F. N¨uske, and B. Hamzi,Kernel-based approximation of the koopman generator and schr¨ odinger operator, Entropy, 22 (2020), p. 722

work page 2020
[30]

S. Klus, I. Schuster, and K. Muandet,Eigendecompositions of transfer operators in reproducing kernel hilbert spaces, Journal of Nonlinear Science, 30 (2020), pp. 283–315

work page 2020
[31]

Towards tensor-based methods for the numerical approximation of the Perron-Frobenius and Koopman operator

S. Klus and C. Schutte,Towards tensor-based methods for the numerical approximation of the perron-frobenius and koopman operator. arxiv e-prints, arXiv preprint arXiv:1512.06527, (2015)

work page internal anchor Pith review Pith/arXiv arXiv 2015
[32]

K¨ohne, F

F. K¨ohne, F. M. Philipp, M. Schaller, A. Schiela, and K. Worthmann, -error bounds for approximations of the koopman operator by kernel extended NONPARAMETRIC SPARSE ONLINE LEARNING OF THE KOOPMAN OPERATOR43 dynamic mode decomposition, SIAM journal on applied dynamical systems, 24 (2025), pp. 501–529

work page 2025
[33]

Kontoyiannis and S

I. Kontoyiannis and S. P. Meyn,Geometric ergodicity and the spectral gap of non-reversible markov chains, Probability Theory and Related Fields, 154 (2012), pp. 327–339

work page 2012
[34]

B. O. Koopman and J. v. Neumann,Dynamical systems of continuous spectra, Proceedings of the National Academy of Sciences, 18 (1932), pp. 255–263

work page 1932
[35]

Koppel, G

A. Koppel, G. Warnell, E. Stump, and A. Ribeiro,Parsimonious online learning with kernels via sparse projections in function space, The Journal of Machine Learning Research, 20 (2019), pp. 83–126

work page 2019
[36]

Kostic, K

V. Kostic, K. Lounici, P. Novelli, and M. Pontil,Sharp spectral rates for koopman operator learning, Advances in Neural Information Processing Systems, 36 (2023), pp. 32328–32339

work page 2023
[37]

Kostic, P

V. Kostic, P. Novelli, A. Maurer, C. Ciliberto, L. Rosasco, and M. Pontil,Learning dynamical systems via koopman operator regression in reproducing kernel hilbert spaces, Advances in Neural Information Processing Systems, 35 (2022), pp. 4017–4031

work page 2022
[38]

Lasota and M

A. Lasota and M. C. Mackey,Chaos, fractals, and noise: stochastic aspects of dynamics, vol. 97, Springer Science & Business Media, 2013

work page 2013
[39]

D. A. Levin and Y. Peres,Markov chains and mixing times, vol. 107, American Mathematical Soc., 2017

work page 2017
[40]

Z. Li, D. Meunier, M. Mollenhauer, and A. Gretton,Optimal rates for regularized conditional mean embedding learning, Advances in Neural Information Processing Systems, 35 (2022), pp. 4433–4445

work page 2022
[41]

D. G. Luenberger,Optimization by vector space methods, John Wiley & Sons, 1997

work page 1997
[42]

A. R. Matavalam, B. Hou, H. Choi, S. Bose, and U. Vaidya,Data-driven transient stability analysis using the koopman operator, International Journal of Electrical Power & Energy Systems, 162 (2024), p. 110307

work page 2024
[43]

S. P. Meyn and R. L. Tweedie,Markov chains and stochastic stability, Springer Science & Business Media, 2012

work page 2012
[44]

Mezi´c,Spectral properties of dynamical systems, model reduction and decom- positions, Nonlinear Dynamics, 41 (2005), pp

I. Mezi´c,Spectral properties of dynamical systems, model reduction and decom- positions, Nonlinear Dynamics, 41 (2005), pp. 309–325

work page 2005
[45]

Mezi´c,Spectrum of the koopman operator, spectral expansions in functional spaces, and state-space geometry, Journal of Nonlinear Science, 30 (2020), pp

I. Mezi´c,Spectrum of the koopman operator, spectral expansions in functional spaces, and state-space geometry, Journal of Nonlinear Science, 30 (2020), pp. 2091– 2145

work page 2020
[46]

Mezi ´c,Koopman operator, geometry, and learning of dynamical systems, Notices of the American Mathematical Society, 68 (2021), pp

I. Mezi ´c,Koopman operator, geometry, and learning of dynamical systems, Notices of the American Mathematical Society, 68 (2021), pp. 1087–1105

work page 2021
[47]

Nonparamet- ric approximation of conditional expectation opera- tors

M. Mollenhauer and P. Koltai,Nonparametric approximation of conditional expectation operators, arXiv preprint arXiv:2012.12917, (2020)

work page arXiv 2012
[48]

Nummelin,General irreducible Markov chains and non-negative operators, no

E. Nummelin,General irreducible Markov chains and non-negative operators, no. 83, Cambridge University Press, 2004

work page 2004
[49]

S. E. Otto and C. W. Rowley,Koopman operators for estimation and control of dynamical systems, Annual Review of Control, Robotics, and Autonomous Systems, 4 (2021), pp. 59–87

work page 2021
[50]

Park and K

J. Park and K. Muandet,A measure-theoretic approach to kernel conditional mean embeddings, Advances in Neural Information Processing Systems, 33 (2020), pp. 21247–21259

work page 2020
[51]

Poincar´e,Les m´ ethodes nouvelles de la m´ ecanique c´ eleste, vol

H. Poincar´e,Les m´ ethodes nouvelles de la m´ ecanique c´ eleste, vol. 3, Gauthier- Villars et fils, 1899. 44B. HOU, S. SANJARI, N. DAHLIN, A. KOPPEL AND B. SUBHONMESH

work page
[52]

Robbins and D

H. Robbins and D. Siegmund,A convergence theorem for non negative almost supermartingales and some applications, in Optimizing methods in statistics, Elsevier, 1971, pp. 233–257

work page 1971
[53]

J. A. Rosenfeld, B. P. Russo, R. Kamalapurkar, and T. T. Johnson, The occupation kernel method for nonlinear system identification, SIAM Journal on Control and Optimization, 62 (2024), pp. 1643–1668

work page 2024
[54]

C. W. Rowley, I. Mezi ´c, S. Bagheri, P. Schlatter, and D. S. Henning- son,Spectral analysis of nonlinear flows, Journal of fluid mechanics, 641 (2009), pp. 115–127

work page 2009
[55]

Rudin,Functional Analysis, International series in pure and applied mathematics, McGraw-Hill, 1991, https://books.google.com/books?id=Sh vAAAAMAAJ

W. Rudin,Functional Analysis, International series in pure and applied mathematics, McGraw-Hill, 1991, https://books.google.com/books?id=Sh vAAAAMAAJ

work page 1991
[56]

P. J. Schmid,Dynamic mode decomposition of numerical and experimental data, Journal of fluid mechanics, 656 (2010), pp. 5–28

work page 2010
[57]

Smale and D.-X

S. Smale and D.-X. Zhou,Online learning with markov sampling, Analysis and Applications, 7 (2009), pp. 87–113

work page 2009
[58]

L. Song, J. Huang, A. Smola, and K. Fukumizu,Hilbert space embeddings of conditional distributions with applications to dynamical systems, in Proceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 961–968

work page 2009
[59]

Srikant and L

R. Srikant and L. Ying,Finite-time error bounds for linear stochastic ap- proximation andtd learning, in Conference on Learning Theory, PMLR, 2019, pp. 2803–2830

work page 2019
[60]

Steinwart and C

I. Steinwart and C. Scovel,Mercer’s theorem on general domains: On the interaction between measures, kernels, and RKHSs, Constructive Approximation, 35 (2012), pp. 363–417. [58]C. Szepesv ´ari,Algorithms for reinforcement learning, Springer nature, 2022

work page 2012
[61]

Tarres and Y

P. Tarres and Y. Yao,Online learning as stochastic approximation of regu- larization paths: Optimality and almost-sure convergence, IEEE Transactions on Information Theory, 60 (2014), pp. 5716–5735

work page 2014
[62]

Vincent and Y

P. Vincent and Y. Bengio,Kernel matching pursuit, Machine learning, 48 (2002), pp. 165–187

work page 2002
[63]

M. O. Williams, I. G. Kevrekidis, and C. W. Rowley,A data–driven approximation of the koopman operator: Extending dynamic mode decomposition, Journal of Nonlinear Science, 25 (2015), pp. 1307–1346

work page 2015
[64]

M. O. Williams, C. W. Rowley, and I. G. Kevrekidis,A kernel-based approach to data-driven koopman spectral analysis, Journal of Computational Dynamics, (2014)

work page 2014
[65]

C. M. Zagabe and A. Mauroy,Uniform global stability of switched nonlinear systems in the koopman operator framework, SIAM Journal on Control and Optimization, 63 (2025), pp. 472–501

work page 2025
[66]

Zhang, C

H. Zhang, C. W. Rowley, E. A. Deem, and L. N. Cattafesta,Online dynamic mode decomposition for time-varying systems, SIAM Journal on Applied Dynamical Systems, 18 (2019), pp. 1586–1609

work page 2019

[1] [1]

Our algorithm builds on stochastic operator gradient descent (SOGD) for U to solve the regression problem in (3.4)

Sparse Online Learning Algorithm.Having established that K is the adjoint of U in Theorem 4.1, we next present an online algorithm to construct K iteratively. Our algorithm builds on stochastic operator gradient descent (SOGD) for U to solve the regression problem in (3.4). Our framework defines a sharp deviation from prior art that uses sample average ap...

work page 2000

[2] [2]

Following Section 4, we make the following assumption on the regularity ofK, which encodes the regularity of the Markov kernel

Convergence Analysis with Trajectory-Based Sampling.We now pres- ent our theoretical results on the convergence behavior of the sparse SOGD algorithm proposed in Section 5. Following Section 4, we make the following assumption on the regularity ofK, which encodes the regularity of the Markov kernel. Assumption2.K∈HS (H,[H]β)for some β∈(0, 1)and ∥K∥HS(H,[H...

work page

[3] [3]

Applications. 7.1. Analyzing Unknown Nonlinear Dynamics.Consider the unforced Duffing oscillator, described by ¨z=−δ˙z−z ( β+αz2) , with δ= 0.5, β=−1, and α= 1, where z∈Rand ˙z∈Rare the scalar position and velocity. Let x = (z,˙z), as shown in Figure 3(a), the Duffing dynamics exhibits two ROAs, corresponding to stable equilibrium points at x = (−1, 0) an...

work page

[4] [4]

As the dataset size increases, the estimated value functions capture important features of the reference such as the high-value diagonal passing through the stationary, upright pendulum position

work page

[5] [5]

Our method does not require the RKHS to be closed under the Markov kernel

Discussions and Conclusions.In this paper, we present an online learning algorithm that learns a sparse Koopman operator in RKHS with sampling from trajectories. Our method does not require the RKHS to be closed under the Markov kernel. We established the asymptotic and finite-time convergence guarantee of the online sparse SOGD algorithm. The computation...

work page 2011

[6] [6]

Bach and E

F. Bach and E. Moulines,Non-strongly-convex smooth stochastic approxima- tion with convergence rate o (1/n), Advances in neural information processing systems, 26 (2013)

work page 2013

[7] [7]

Berlinet and C

A. Berlinet and C. Thomas-Agnan,Reproducing kernel Hilbert spaces in probability and statistics, Springer Science & Business Media, 2011

work page 2011

[8] [8]

Bevanda, M

P. Bevanda, M. Beier, A. Lederer, S. Sosnowski, E. H ¨ullermeier, and S. Hirche,Koopman kernel regression, Advances in Neural Information Processing Systems, 36 (2023), pp. 16207–16221

work page 2023

[9] [9]

L. Bold, F. M. Philipp, M. Schaller, and K. Worthmann,Kernel-based koopman approximants for control: Flexible sampling, error analysis, and stability, arXiv preprint arXiv:2412.02811, (2024)

work page arXiv 2024

[10] [10]

V. S. Borkar and V. S. Borkar,Stochastic approximation: a dynamical systems viewpoint, vol. 9, Springer, 2008. [7]G. Brockman,Openai gym, arXiv preprint arXiv:1606.01540, (2016)

work page internal anchor Pith review Pith/arXiv arXiv 2008

[11] [11]

S. L. Brunton, J. L. Proctor, and J. N. Kutz,Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proceedings of the National Academy of Sciences, 113 (2016), pp. 3932–3937

work page 2016

[12] [12]

Budisi´c, R

M. Budisi´c, R. Mohr, and I. Mezi ´c,Applied koopmanism, Chaos: An Inter- disciplinary Journal of Nonlinear Science, 22 (2012), p. 047510

work page 2012

[13] [13]

Z. Chen, S. Zhang, T. T. Doan, J.-P. Clarke, and S. T. Maguluri, Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning, Automatica, 146 (2022), p. 110623. 42B. HOU, S. SANJARI, N. DAHLIN, A. KOPPEL AND B. SUBHONMESH

work page 2022

[14] [14]

Ciliberto, L

C. Ciliberto, L. Rosasco, and A. Rudi,A consistent regularization approach for structured prediction, Advances in neural information processing systems, 29 (2016)

work page 2016

[15] [15]

M. J. Colbrook, I. Mezi ´c, and A. Stepanenko,Limits and powers of koopman learning, arXiv preprint arXiv:2407.06312, (2024)

work page arXiv 2024

[16] [16]

Dinculeanu,Vector integration and stochastic integration in Banach spaces, vol

N. Dinculeanu,Vector integration and stochastic integration in Banach spaces, vol. 48, John Wiley & Sons, 2000

work page 2000

[17] [17]

Fischer and I

S. Fischer and I. Steinwart,Sobolev norm learning rates for regularized least-squares algorithms, The Journal of Machine Learning Research, 21 (2020), pp. 8464–8501

work page 2020

[18] [18]

Giannakis, A

D. Giannakis, A. Henriksen, J. A. Tropp, and R. Ward,Learning to fore- cast dynamical systems from streaming data, SIAM Journal on Applied Dynamical Systems, 22 (2023), pp. 527–558

work page 2023

[19] [19]

Gr¨unew¨alder, G

S. Gr¨unew¨alder, G. Lever, L. Baldassarre, S. Patterson, A. Gret- ton, and M. Pontil,Conditional mean embeddings as regressors, International Conference on Machine Learning, (2012)

work page 2012

[20] [20]

Grunewalder, G

S. Grunewalder, G. Lever, L. Baldassarre, M. Pontil, and A. Gret- ton,Modelling transition dynamics in MDPs with RKHS embeddings, Interna- tional Conference on Machine Learning, (2012)

work page 2012

[21] [21]

R. A. Horn and C. R. Johnson,Topics in matrix analysis, Cambridge univer- sity press, 1994

work page 1994

[22] [22]

B. Hou, A. R. R. Matavalam, S. Bose, and U. Vaidya,Propagating uncertainty through system dynamics in reproducing kernel hilbert space, Physica D: Nonlinear Phenomena, (2024), p. 134168

work page 2024

[23] [23]

B. Hou, S. Sanjari, N. Dahlin, and S. Bose,Compressed decentralized learning of conditional mean embedding operators in reproducing kernel hilbert spaces, Proceedings of the AAAI Conference on Artificial Intelligence, (2023)

work page 2023

[24] [24]

B. Hou, S. Sanjari, N. Dahlin, S. Bose, and U. Vaidya,Sparse learning of dynamical systems in RKHS: An operator-theoretic approach, in International Conference on Machine Learning, PMLR, 2023, pp. 13325–13352

work page 2023

[25] [25]

B. Hou, S. Sanjari, N. Dahlin, A. Koppel, and S. Bose,Nonparametric sparse online learning of the koopman operator, arXiv preprint arXiv:2405.07432, (2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024

[26] [26]

M. R. Jovanovi´c, P. J. Schmid, and J. W. Nichols,Sparsity-promoting dynamic mode decomposition, Physics of Fluids, 26 (2014)

work page 2014

[27] [27]

Kato,Perturbation theory for linear operators, vol

T. Kato,Perturbation theory for linear operators, vol. 132, Springer Science & Business Media, 2013

work page 2013

[28] [28]

Kawahara,Dynamic mode decomposition with reproducing kernels for koop- man spectral analysis, Advances in neural information processing systems, 29 (2016)

Y. Kawahara,Dynamic mode decomposition with reproducing kernels for koop- man spectral analysis, Advances in neural information processing systems, 29 (2016)

work page 2016

[29] [29]

S. Klus, F. N¨uske, and B. Hamzi,Kernel-based approximation of the koopman generator and schr¨ odinger operator, Entropy, 22 (2020), p. 722

work page 2020

[30] [30]

S. Klus, I. Schuster, and K. Muandet,Eigendecompositions of transfer operators in reproducing kernel hilbert spaces, Journal of Nonlinear Science, 30 (2020), pp. 283–315

work page 2020

[31] [31]

Towards tensor-based methods for the numerical approximation of the Perron-Frobenius and Koopman operator

S. Klus and C. Schutte,Towards tensor-based methods for the numerical approximation of the perron-frobenius and koopman operator. arxiv e-prints, arXiv preprint arXiv:1512.06527, (2015)

work page internal anchor Pith review Pith/arXiv arXiv 2015

[32] [32]

K¨ohne, F

F. K¨ohne, F. M. Philipp, M. Schaller, A. Schiela, and K. Worthmann, -error bounds for approximations of the koopman operator by kernel extended NONPARAMETRIC SPARSE ONLINE LEARNING OF THE KOOPMAN OPERATOR43 dynamic mode decomposition, SIAM journal on applied dynamical systems, 24 (2025), pp. 501–529

work page 2025

[33] [33]

Kontoyiannis and S

I. Kontoyiannis and S. P. Meyn,Geometric ergodicity and the spectral gap of non-reversible markov chains, Probability Theory and Related Fields, 154 (2012), pp. 327–339

work page 2012

[34] [34]

B. O. Koopman and J. v. Neumann,Dynamical systems of continuous spectra, Proceedings of the National Academy of Sciences, 18 (1932), pp. 255–263

work page 1932

[35] [35]

Koppel, G

A. Koppel, G. Warnell, E. Stump, and A. Ribeiro,Parsimonious online learning with kernels via sparse projections in function space, The Journal of Machine Learning Research, 20 (2019), pp. 83–126

work page 2019

[36] [36]

Kostic, K

V. Kostic, K. Lounici, P. Novelli, and M. Pontil,Sharp spectral rates for koopman operator learning, Advances in Neural Information Processing Systems, 36 (2023), pp. 32328–32339

work page 2023

[37] [37]

Kostic, P

V. Kostic, P. Novelli, A. Maurer, C. Ciliberto, L. Rosasco, and M. Pontil,Learning dynamical systems via koopman operator regression in reproducing kernel hilbert spaces, Advances in Neural Information Processing Systems, 35 (2022), pp. 4017–4031

work page 2022

[38] [38]

Lasota and M

A. Lasota and M. C. Mackey,Chaos, fractals, and noise: stochastic aspects of dynamics, vol. 97, Springer Science & Business Media, 2013

work page 2013

[39] [39]

D. A. Levin and Y. Peres,Markov chains and mixing times, vol. 107, American Mathematical Soc., 2017

work page 2017

[40] [40]

Z. Li, D. Meunier, M. Mollenhauer, and A. Gretton,Optimal rates for regularized conditional mean embedding learning, Advances in Neural Information Processing Systems, 35 (2022), pp. 4433–4445

work page 2022

[41] [41]

D. G. Luenberger,Optimization by vector space methods, John Wiley & Sons, 1997

work page 1997

[42] [42]

A. R. Matavalam, B. Hou, H. Choi, S. Bose, and U. Vaidya,Data-driven transient stability analysis using the koopman operator, International Journal of Electrical Power & Energy Systems, 162 (2024), p. 110307

work page 2024

[43] [43]

S. P. Meyn and R. L. Tweedie,Markov chains and stochastic stability, Springer Science & Business Media, 2012

work page 2012

[44] [44]

Mezi´c,Spectral properties of dynamical systems, model reduction and decom- positions, Nonlinear Dynamics, 41 (2005), pp

I. Mezi´c,Spectral properties of dynamical systems, model reduction and decom- positions, Nonlinear Dynamics, 41 (2005), pp. 309–325

work page 2005

[45] [45]

Mezi´c,Spectrum of the koopman operator, spectral expansions in functional spaces, and state-space geometry, Journal of Nonlinear Science, 30 (2020), pp

I. Mezi´c,Spectrum of the koopman operator, spectral expansions in functional spaces, and state-space geometry, Journal of Nonlinear Science, 30 (2020), pp. 2091– 2145

work page 2020

[46] [46]

Mezi ´c,Koopman operator, geometry, and learning of dynamical systems, Notices of the American Mathematical Society, 68 (2021), pp

I. Mezi ´c,Koopman operator, geometry, and learning of dynamical systems, Notices of the American Mathematical Society, 68 (2021), pp. 1087–1105

work page 2021

[47] [47]

Nonparamet- ric approximation of conditional expectation opera- tors

M. Mollenhauer and P. Koltai,Nonparametric approximation of conditional expectation operators, arXiv preprint arXiv:2012.12917, (2020)

work page arXiv 2012

[48] [48]

Nummelin,General irreducible Markov chains and non-negative operators, no

E. Nummelin,General irreducible Markov chains and non-negative operators, no. 83, Cambridge University Press, 2004

work page 2004

[49] [49]

S. E. Otto and C. W. Rowley,Koopman operators for estimation and control of dynamical systems, Annual Review of Control, Robotics, and Autonomous Systems, 4 (2021), pp. 59–87

work page 2021

[50] [50]

Park and K

J. Park and K. Muandet,A measure-theoretic approach to kernel conditional mean embeddings, Advances in Neural Information Processing Systems, 33 (2020), pp. 21247–21259

work page 2020

[51] [51]

Poincar´e,Les m´ ethodes nouvelles de la m´ ecanique c´ eleste, vol

H. Poincar´e,Les m´ ethodes nouvelles de la m´ ecanique c´ eleste, vol. 3, Gauthier- Villars et fils, 1899. 44B. HOU, S. SANJARI, N. DAHLIN, A. KOPPEL AND B. SUBHONMESH

work page

[52] [52]

Robbins and D

H. Robbins and D. Siegmund,A convergence theorem for non negative almost supermartingales and some applications, in Optimizing methods in statistics, Elsevier, 1971, pp. 233–257

work page 1971

[53] [53]

J. A. Rosenfeld, B. P. Russo, R. Kamalapurkar, and T. T. Johnson, The occupation kernel method for nonlinear system identification, SIAM Journal on Control and Optimization, 62 (2024), pp. 1643–1668

work page 2024

[54] [54]

C. W. Rowley, I. Mezi ´c, S. Bagheri, P. Schlatter, and D. S. Henning- son,Spectral analysis of nonlinear flows, Journal of fluid mechanics, 641 (2009), pp. 115–127

work page 2009

[55] [55]

Rudin,Functional Analysis, International series in pure and applied mathematics, McGraw-Hill, 1991, https://books.google.com/books?id=Sh vAAAAMAAJ

W. Rudin,Functional Analysis, International series in pure and applied mathematics, McGraw-Hill, 1991, https://books.google.com/books?id=Sh vAAAAMAAJ

work page 1991

[56] [56]

P. J. Schmid,Dynamic mode decomposition of numerical and experimental data, Journal of fluid mechanics, 656 (2010), pp. 5–28

work page 2010

[57] [57]

Smale and D.-X

S. Smale and D.-X. Zhou,Online learning with markov sampling, Analysis and Applications, 7 (2009), pp. 87–113

work page 2009

[58] [58]

L. Song, J. Huang, A. Smola, and K. Fukumizu,Hilbert space embeddings of conditional distributions with applications to dynamical systems, in Proceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 961–968

work page 2009

[59] [59]

Srikant and L

R. Srikant and L. Ying,Finite-time error bounds for linear stochastic ap- proximation andtd learning, in Conference on Learning Theory, PMLR, 2019, pp. 2803–2830

work page 2019

[60] [60]

Steinwart and C

I. Steinwart and C. Scovel,Mercer’s theorem on general domains: On the interaction between measures, kernels, and RKHSs, Constructive Approximation, 35 (2012), pp. 363–417. [58]C. Szepesv ´ari,Algorithms for reinforcement learning, Springer nature, 2022

work page 2012

[61] [61]

Tarres and Y

P. Tarres and Y. Yao,Online learning as stochastic approximation of regu- larization paths: Optimality and almost-sure convergence, IEEE Transactions on Information Theory, 60 (2014), pp. 5716–5735

work page 2014

[62] [62]

Vincent and Y

P. Vincent and Y. Bengio,Kernel matching pursuit, Machine learning, 48 (2002), pp. 165–187

work page 2002

[63] [63]

M. O. Williams, I. G. Kevrekidis, and C. W. Rowley,A data–driven approximation of the koopman operator: Extending dynamic mode decomposition, Journal of Nonlinear Science, 25 (2015), pp. 1307–1346

work page 2015

[64] [64]

M. O. Williams, C. W. Rowley, and I. G. Kevrekidis,A kernel-based approach to data-driven koopman spectral analysis, Journal of Computational Dynamics, (2014)

work page 2014

[65] [65]

C. M. Zagabe and A. Mauroy,Uniform global stability of switched nonlinear systems in the koopman operator framework, SIAM Journal on Control and Optimization, 63 (2025), pp. 472–501

work page 2025

[66] [66]

Zhang, C

H. Zhang, C. W. Rowley, E. A. Deem, and L. N. Cattafesta,Online dynamic mode decomposition for time-varying systems, SIAM Journal on Applied Dynamical Systems, 18 (2019), pp. 1586–1609

work page 2019