Recognition: unknown
Continuous Limits of Coupled Flows in Representation Learning
Pith reviewed 2026-05-10 07:02 UTC · model grok-4.3
The pith
Discrete decentralized learning on manifolds converges to orthogonally disentangled features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Formalizing decentralized learning as a coupled slow-fast dynamical system on Riemannian manifolds, the discrete spatial transitions converge uniformly to an overdamped Langevin stochastic differential equation. Via the Itô-Poisson resolvent and a stochastic extension of LaSalle's Invariance Principle, the representation weights unconditionally avoid divergence and align strictly with the principal eigenspace of the spatial measure. A joint Lyapunov functional for the fully coupled spatial-parametric flow proves global dissipativity, with orthogonally disentangled, linearly separable features emerging at the stationary limit.
What carries the argument
The joint Lyapunov functional for the fully coupled spatial-parametric flow, which establishes global dissipativity via the Itô-Poisson resolvent and stochastic LaSalle invariance.
Load-bearing premise
The discrete spatial transitions converge uniformly to an overdamped Langevin stochastic differential equation via measure-theoretic limits on the Riemannian manifold.
What would settle it
A simulation or analytic counterexample in which the discrete coupled flow diverges or fails to align weights with the principal eigenspace of the spatial measure.
Figures
read the original abstract
While modern representation learning relies heavily on global error signals, decentralized algorithms driven by local interactions offer a fundamental distributed alternative. However, the macroscopic convergence properties of these discrete dynamics on continuous data manifolds remain theoretically unresolved, notoriously suffering from parameter explosion. We bridge this gap by formalizing decentralized learning as a coupled slow-fast dynamical system on Riemannian manifolds. First, using measure-theoretic limits, we prove that the discrete spatial transitions converge uniformly to an overdamped Langevin stochastic differential equation. Second, via the It\^o-Poisson resolvent and a stochastic extension of LaSalle's Invariance Principle, we establish that the representation weights unconditionally avoid divergence and align strictly with the principal eigenspace of the spatial measure. Finally, we construct a joint Lyapunov functional for the fully coupled spatial-parametric flow. This proves global dissipativity and demonstrates that orthogonally disentangled, linearly separable features emerge spontaneously at the stationary limit. Our framework bridges discrete algorithms with continuous stochastic analysis, providing a formal theoretical baseline for decentralized representation learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript formalizes decentralized representation learning as a coupled slow-fast dynamical system on Riemannian manifolds. It claims three main results: (1) using measure-theoretic limits, discrete spatial transitions converge uniformly to an overdamped Langevin SDE; (2) via the Itô-Poisson resolvent and stochastic LaSalle invariance, representation weights unconditionally avoid divergence and align with the principal eigenspace of the spatial measure; (3) a joint Lyapunov functional for the fully coupled flow proves global dissipativity and shows that orthogonally disentangled, linearly separable features emerge spontaneously at the stationary limit.
Significance. If the central derivations hold with the required technical conditions, the work would provide a valuable theoretical bridge between discrete decentralized algorithms and continuous stochastic analysis on manifolds. It offers a formal baseline for understanding convergence and the spontaneous emergence of disentangled representations without global error signals, potentially informing the design of local-interaction methods that avoid parameter explosion.
major comments (2)
- [Abstract] Abstract, first proof step: the claim that discrete spatial transitions converge uniformly to the overdamped Langevin SDE via measure-theoretic limits is load-bearing for all subsequent results, yet the abstract invokes only “measure-theoretic limits” without stating the requisite Lipschitz or moment conditions on the coupled interaction kernels, tightness of transition measures, or equicontinuity in local charts with respect to the manifold metric. No error bounds or assumption checks are supplied.
- [Abstract] Abstract, second and third steps: the application of the Itô-Poisson resolvent plus stochastic LaSalle to establish unconditional alignment, and the construction of the joint Lyapunov functional for global dissipativity, are derived inside the continuous limit; any gap in the discrete-to-continuous passage therefore propagates directly to the headline claims of spontaneous orthogonal disentanglement and linear separability.
minor comments (1)
- [Abstract] The LaTeX fragment “It^o-Poisson” should be rendered as “Itô-Poisson” for typographic correctness.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review of our manuscript. We address each major comment point by point below, indicating where revisions will be incorporated.
read point-by-point responses
-
Referee: [Abstract] Abstract, first proof step: the claim that discrete spatial transitions converge uniformly to the overdamped Langevin SDE via measure-theoretic limits is load-bearing for all subsequent results, yet the abstract invokes only “measure-theoretic limits” without stating the requisite Lipschitz or moment conditions on the coupled interaction kernels, tightness of transition measures, or equicontinuity in local charts with respect to the manifold metric. No error bounds or assumption checks are supplied.
Authors: The abstract is a high-level summary. The full technical conditions—Lipschitz continuity and moment bounds on the interaction kernels (Assumption 3.1), tightness of transition measures, and equicontinuity in local charts—are stated explicitly in Section 3. Theorem 3.2 proves uniform convergence to the overdamped Langevin SDE with explicit error bounds of order O(Δt + h), where Δt is the time step and h the spatial discretization parameter. We will revise the abstract to reference these conditions and the error bound for clarity. revision: yes
-
Referee: [Abstract] Abstract, second and third steps: the application of the Itô-Poisson resolvent plus stochastic LaSalle to establish unconditional alignment, and the construction of the joint Lyapunov functional for global dissipativity, are derived inside the continuous limit; any gap in the discrete-to-continuous passage therefore propagates directly to the headline claims of spontaneous orthogonal disentanglement and linear separability.
Authors: The second and third results are indeed derived for the limiting SDE. However, Section 3 establishes the uniform convergence under the stated assumptions, and the appendix verifies that the discrete system satisfies the prerequisites for the Itô-Poisson resolvent and stochastic LaSalle invariance to carry over. The joint Lyapunov functional is constructed to be preserved in the limit, so the claims on alignment and disentanglement hold for the original discrete dynamics via the convergence theorem. We see no unaddressed gap. revision: no
Circularity Check
No circularity: derivations invoke external theorems without self-referential reduction.
full rationale
The paper's chain proceeds by (1) invoking measure-theoretic limits to obtain uniform convergence of discrete transitions to an overdamped Langevin SDE on a Riemannian manifold, (2) applying the Itô-Poisson resolvent together with a stochastic extension of LaSalle's invariance principle to obtain unconditional alignment with the principal eigenspace, and (3) constructing a joint Lyapunov functional to prove global dissipativity and spontaneous orthogonal disentanglement. Each step relies on standard, externally established mathematical machinery (SDE convergence theory, stochastic LaSalle) whose statements and proofs are independent of the present paper's fitted quantities or definitions. No equation is shown to equal its own input by construction, no parameter is fitted on a subset and then relabeled a prediction, and no load-bearing premise reduces to a self-citation whose own justification is internal. The framework therefore remains non-circular.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Data points lie on a Riemannian manifold
- ad hoc to paper Measure-theoretic limits of discrete transitions exist and are uniform
Reference graph
Works this paper leans on
-
[1]
Springer Science & Business Media, 2012
Ralph Abraham, Jerrold E Marsden, and Tudor Ratiu.Manifolds, Tensor Analysis, and Applications, volume 75. Springer Science & Business Media, 2012
2012
-
[2]
Princeton University Press, 2009
P-A Absil, Robert Mahony, and Rodolphe Sepulchre.Optimization algorithms on matrix manifolds. Princeton University Press, 2009
2009
-
[3]
Birkhäuser, 2008
Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré.Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser, 2008
2008
-
[4]
A theoretical analysis of contrastive unsupervised representation learning
Sanjeev Arora, Hrishikesh Khandeparkar, Mikhail Khodak, Orestis Plevrakis, and Nikunj Saunshi. A theoretical analysis of contrastive unsupervised representation learning. In International Conference on Machine Learning, pages 5628–5637. PMLR, 2019
2019
-
[5]
Springer, 2014
Dominique Bakry, Ivan Gentil, and Michel Ledoux.Analysis and Geometry of Markov Diffusion Operators, volume 348. Springer, 2014
2014
-
[6]
Neural networks and principal component analysis: Learning from examples without local minima.Neural networks, 2(1):53–58, 1989
Pierre Baldi and Kurt Hornik. Neural networks and principal component analysis: Learning from examples without local minima.Neural networks, 2(1):53–58, 1989
1989
-
[7]
Laplacian eigenmaps and spectral techniques for embedding and clustering
Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. InAdvances in neural information processing systems, pages 585–591, 2001
2001
-
[8]
Representation learning: A review and new perspectives.IEEE transactions on pattern analysis and machine intelligence, 35(8): 1798–1828, 2013
Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives.IEEE transactions on pattern analysis and machine intelligence, 35(8): 1798–1828, 2013
2013
-
[9]
Prentice hall, 1989
Dimitri P Bertsekas and John N Tsitsiklis.Parallel and distributed computation: numerical methods. Prentice hall, 1989
1989
-
[10]
John Wiley & Sons, 3rd edition, 1995
Patrick Billingsley.Probability and Measure. John Wiley & Sons, 3rd edition, 1995
1995
-
[11]
Springer Science & Business Media, 2007
Vladimir I Bogachev.Measure theory, volume 1. Springer Science & Business Media, 2007
2007
-
[12]
Springer, 2009
Vivek S Borkar.Stochastic approximation: a dynamical systems viewpoint. Springer, 2009
2009
-
[13]
Oxford University Press, 2013
Stéphane Boucheron, Gábor Lugosi, and Pascal Massart.Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013
2013
-
[14]
Geometric deep learning: going beyond euclidean data.IEEE signal processing magazine, 34 (4):18–42, 2017
Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. Geometric deep learning: going beyond euclidean data.IEEE signal processing magazine, 34 (4):18–42, 2017
2017
-
[15]
Springer Science & Business Media, 2002
Han-Fu Chen.Stochastic Approximation and Its Applications, volume 64. Springer Science & Business Media, 2002
2002
-
[16]
Neural ordinary differential equations
Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations. InAdvances in neural information processing systems, volume 31, 2018
2018
-
[17]
A simple framework for contrastive learning of visual representations
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InInternational conference on machine learning, pages 1597–1607. PMLR, 2020
2020
-
[18]
On the global convergence of gradient descent for over- parameterized models using optimal transport
Lénaïc Chizat and Francis Bach. On the global convergence of gradient descent for over- parameterized models using optimal transport. InAdvances in neural information processing systems, volume 31, 2018
2018
-
[19]
American Mathematical Society, 1997
Fan RK Chung.Spectral Graph Theory, volume 92. American Mathematical Society, 1997
1997
-
[20]
Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006
Ronald R Coifman and Stéphane Lafon. Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006
2006
-
[21]
Cambridge university press, 2014
Giuseppe Da Prato and Jerzy Zabczyk.Stochastic equations in infinite dimensions. Cambridge university press, 2014. 10
2014
-
[22]
Random geometric graphs.Physical review E, 66(1): 016121, 2002
Jesper Dall and Michael Christensen. Random geometric graphs.Physical review E, 66(1): 016121, 2002
2002
-
[23]
The rotation of eigenvectors by a perturbation
Chandler Davis and William M Kahan. The rotation of eigenvectors by a perturbation. iii. SIAM Journal on Numerical Analysis, 7(1):1–46, 1970
1970
-
[24]
Bert: Pre-training of deep bidirectional transformers for language understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. InProceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 4171–4186, 2019
2019
-
[25]
Untangling invariant object recognition.Trends in cognitive sciences, 11(8):333–341, 2007
James J DiCarlo and David D Cox. Untangling invariant object recognition.Trends in cognitive sciences, 11(8):333–341, 2007. doi: 10.1016/j.tics.2007.06.010
-
[26]
American Mathematical Society, 1977
Joseph Diestel and John Jerry Uhl.Vector Measures, volume 15 ofMathematical Surveys and Monographs. American Mathematical Society, 1977
1977
-
[27]
Springer-Verlag Heidelberg, 5th edition, 2017
Reinhard Diestel.Graph Theory. Springer-Verlag Heidelberg, 5th edition, 2017
2017
-
[28]
Gossip algorithms for distributed signal processing.Proceedings of the IEEE, 98 (11):1847–1864, 2010
Alexandros G Dimakis, Soummya Kar, José MF Moura, Michael G Rabbat, and Anna Scaglione. Gossip algorithms for distributed signal processing.Proceedings of the IEEE, 98 (11):1847–1864, 2010
2010
-
[29]
Birkhäuser, 1992
Manfredo Perdigao Do Carmo.Riemannian Geometry. Birkhäuser, 1992
1992
-
[30]
An image is worth 16x16 words: Transformers for image recognition at scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR), 2021
2021
-
[31]
Cambridge University Press, 2002
Richard M Dudley.Real Analysis and Probability. Cambridge University Press, 2002
2002
-
[32]
The geometry of algorithms with orthogonality constraints.SIAM journal on Matrix Analysis and Applications, 20(2):303–353, 1998
Alan Edelman, Tomás A Arias, and Steven T Smith. The geometry of algorithms with orthogonality constraints.SIAM journal on Matrix Analysis and Applications, 20(2):303–353, 1998
1998
-
[33]
John Wiley & Sons, 2009
Stewart N Ethier and Thomas G Kurtz.Markov Processes: Characterization and Convergence. John Wiley & Sons, 2009
2009
-
[34]
American Mathematical Society, 2nd edition, 2010
Lawrence C Evans.Partial Differential Equations, volume 19. American Mathematical Society, 2nd edition, 2010
2010
-
[35]
Springer, 1969
Herbert Federer.Geometric Measure Theory. Springer, 1969
1969
-
[36]
Testing the manifold hypothesis
Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan. Testing the manifold hypothesis. Journal of the American Mathematical Society, 29(4):983–1049, 2016
2016
-
[37]
John Wiley & Sons, 2nd edition, 1999
Gerald B Folland.Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, 2nd edition, 1999
1999
-
[38]
Springer Science & Business Media, 3rd edition, 2012
Mark I Freidlin and Alexander D Wentzell.Random Perturbations of Dynamical Systems, volume 260. Springer Science & Business Media, 3rd edition, 2012
2012
-
[39]
Springer Berlin, 3rd edition, 2004
Crispin W Gardiner.Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, volume 13. Springer Berlin, 3rd edition, 2004
2004
-
[40]
JHU press, 2013
Gene H Golub and Charles F Van Loan.Matrix computations. JHU press, 2013
2013
-
[41]
MIT press, 2016
Ian Goodfellow, Yoshua Bengio, and Aaron Courville.Deep learning. MIT press, 2016
2016
-
[42]
Ian J Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, and Yoshua Bengio. An empirical investigation of catastrophic forgetting in gradient-based neural networks.arXiv preprint arXiv:1312.6211, 2013
-
[43]
Springer Science & Business Media, 2nd edition, 2004
Alfred Gray.Tubes, volume 221. Springer Science & Business Media, 2nd edition, 2004. 11
2004
-
[44]
The critical power for asymptotic connectivity in wireless networks.Stochastic Analysis and Applications, 16(3):547–566, 1998
Piyush Gupta and Panganamala R Kumar. The critical power for asymptotic connectivity in wireless networks.Stochastic Analysis and Applications, 16(3):547–566, 1998
1998
-
[45]
Springer, 2015
Brian C Hall.Lie groups, Lie algebras, and representations: an elementary introduction, volume 222. Springer, 2015
2015
-
[46]
Deep residual learning for image recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016
2016
-
[47]
Momentum contrast for unsupervised visual representation learning
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020
2020
-
[48]
From graphs to manifolds–weak and strong pointwise consistency of graph laplacians
Matthias Hein, Jean-Yves Audibert, and Ulrike V on Luxburg. From graphs to manifolds–weak and strong pointwise consistency of graph laplacians. InLearning Theory: 18th Annual Conference on Learning Theory, COLT 2005, pages 470–485. Springer, 2005
2005
-
[49]
Towards a Definition of Disentangled Representations
Irina Higgins, David Amos, David Pfau, Demis Hassabis, Matthew Botvinick, and Danilo Rezende. Towards a definition of disentangled representations.arXiv preprint arXiv:1812.02230, 2018
work page Pith review arXiv 2018
-
[50]
Academic press, 2012
Morris W Hirsch, Stephen Smale, and Robert L Devaney.Differential equations, dynamical systems, and an introduction to chaos. Academic press, 2012
2012
-
[51]
Neural networks and physical systems with emergent collective computational abilities.Proceedings of the national academy of sciences, 79(8):2554–2558, 1982
John J Hopfield. Neural networks and physical systems with emergent collective computational abilities.Proceedings of the national academy of sciences, 79(8):2554–2558, 1982
1982
-
[52]
Cambridge University Press, 2nd edition, 2012
Roger A Horn and Charles R Johnson.Matrix Analysis. Cambridge University Press, 2nd edition, 2012
2012
-
[53]
American Mathematical Society, 2002
Elton P Hsu.Stochastic Analysis on Manifolds, volume 38. American Mathematical Society, 2002
2002
-
[54]
Independent component analysis: algorithms and applications
Aapo Hyvärinen and Erkki Oja. Independent component analysis: algorithms and applications. Neural networks, 13(4-5):411–430, 2000
2000
-
[55]
Understanding dimensional collapse in contrastive self-supervised learning
Li Jing, Pascal Vincent, Yann LeCun, and Yuandong Tian. Understanding dimensional collapse in contrastive self-supervised learning. InInternational Conference on Learning Representations, 2022
2022
-
[56]
The variational formulation of the fokker–planck equation.SIAM journal on mathematical analysis, 29(1):1–17, 1998
Richard Jordan, David Kinderlehrer, and Felix Otto. The variational formulation of the fokker–planck equation.SIAM journal on mathematical analysis, 29(1):1–17, 1998
1998
-
[57]
Springer Science & Business Media, 5th edition, 2008
Jürgen Jost.Riemannian Geometry and Geometric Analysis. Springer Science & Business Media, 5th edition, 2008
2008
-
[58]
Springer Science & Business Media, 2nd edition, 1991
Ioannis Karatzas and Steven Shreve.Brownian Motion and Stochastic Calculus, volume 113. Springer Science & Business Media, 2nd edition, 1991
1991
-
[59]
Springer Science & Business Media, 2013
Tosio Kato.Perturbation theory for linear operators, volume 132. Springer Science & Business Media, 2013
2013
-
[60]
John Wiley & Sons, 1979
Frank P Kelly.Reversibility and Stochastic Networks. John Wiley & Sons, 1979
1979
-
[61]
Prentice hall Upper Saddle River, NJ, 2002
Hassan K Khalil.Nonlinear systems, volume 3. Prentice hall Upper Saddle River, NJ, 2002
2002
-
[62]
Springer Science & Business Media, 2011
Rafail Khasminskii.Stochastic stability of differential equations, volume 66. Springer Science & Business Media, 2011
2011
-
[63]
On the principle of averaging for itô’s stochastic differential equations
Rafail Z Khasminskii. On the principle of averaging for itô’s stochastic differential equations. Kibernetika, 4(3):260–279, 1968
1968
-
[64]
Springer, 1992
Peter E Kloeden and Eckhard Platen.Numerical solution of stochastic differential equations. Springer, 1992. 12
1992
-
[65]
MIT Press, 1984
Harold J Kushner.Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory. MIT Press, 1984
1984
-
[66]
Springer, 2nd edition, 2003
Harold J Kushner and G George Yin.Stochastic Approximation and Recursive Algorithms and Applications. Springer, 2nd edition, 2003
2003
-
[67]
Some extensions of liapunov’s second method.IRE Transactions on circuit theory, 7(4):520–527, 1960
Joseph P LaSalle. Some extensions of liapunov’s second method.IRE Transactions on circuit theory, 7(4):520–527, 1960
1960
-
[68]
Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 1998
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 1998
1998
-
[69]
A tutorial on energy-based learning
Yann LeCun, Sumit Chopra, Raia Hadsell, Marc’Aurelio Ranzato, and Fu-Jie Huang. A tutorial on energy-based learning. In Gökhan H. Bakır, Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola, and Ben Taskar, editors,Predicting Structured Data, pages 191–246. MIT Press, 2006
2006
-
[70]
Springer, 2018
John M Lee.Introduction to Riemannian Manifolds. Springer, 2018
2018
-
[71]
American Mathematical Society, 2017
David A Levin and Yuval Peres.Markov Chains and Mixing Times, volume 107. American Mathematical Society, 2017
2017
-
[72]
Neuro-vesicles: Neuromodulation should be a dynamical system, not a tensor decoration, 2025
Zilin Li, Weiwei Xu, and Vicki Kane. Neuro-vesicles: Neuromodulation should be a dynamical system, not a tensor decoration, 2025. URLhttps://arxiv.org/abs/2512.06966
-
[73]
Backpropagation and the brain.Nature Reviews Neuroscience, 21(6):335–346, 2020
Timothy P Lillicrap, Adam Santoro, Luke Marris, Colin J Akerman, and Geoffrey Hinton. Backpropagation and the brain.Nature Reviews Neuroscience, 21(6):335–346, 2020
2020
-
[74]
Self-organization in a perceptual network.Computer, 21(3):105–117, 1988
Ralph Linsker. Self-organization in a perceptual network.Computer, 21(3):105–117, 1988
1988
-
[75]
Challenging common assumptions in the unsupervised learning of disentangled representations
Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Raetsch, Sylvain Gelly, Bernhard Schölkopf, and Olivier Bachem. Challenging common assumptions in the unsupervised learning of disentangled representations. InInternational conference on machine learning, pages 4114–4124. PMLR, 2019
2019
-
[76]
John Wiley & Sons, 3rd edition, 2019
Jan R Magnus and Heinz Neudecker.Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley & Sons, 3rd edition, 2019
2019
-
[77]
Stochastic versions of the lasalle theorem.Journal of Differential Equations, 153(1):175–195, 1999
Xuerong Mao. Stochastic versions of the lasalle theorem.Journal of Differential Equations, 153(1):175–195, 1999
1999
-
[78]
A mean field view of the landscape of two-layer neural networks.Proceedings of the National Academy of Sciences, 115(33): E7665–E7671, 2018
Song Mei, Andrea Montanari, and Phan-Minh Nguyen. A mean field view of the landscape of two-layer neural networks.Proceedings of the National Academy of Sciences, 115(33): E7665–E7671, 2018
2018
-
[79]
SIAM, 2000
Carl D Meyer.Matrix analysis and applied linear algebra, volume 71. SIAM, 2000
2000
-
[80]
World Scientific Publishing Company, 1987
Marc Mézard, Giorgio Parisi, and Miguel Angel Virasoro.Spin glass theory and beyond: An Introduction to the Replica Method and Its Applications, volume 9. World Scientific Publishing Company, 1987
1987
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.