Diffusion Processes on Implicit Manifolds

Adam Gosztolai; Clara Grotehans; Pierre Vandergheynst; Victor Kawasaki-Borruat

arxiv: 2604.07213 · v2 · pith:BIDBUGJOnew · submitted 2026-04-08 · 💻 cs.LG · math.PR

Pith reviewed 2026-05-21 09:32 UTC · model grok-4.3

keywords: diffusion processes, implicit manifolds, point clouds, proximity graphs, stochastic differential equations, manifold learning, generative modeling

The pith

Diffusion processes defined on point clouds converge in law to their smooth manifold versions as sampling density increases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method to run diffusion processes that stay on the underlying manifold of high-dimensional data using only scattered point samples. The approach approximates the generator of the diffusion with a graph connecting nearby points and an operator that identifies the local directions of the manifold. The main result shows that these discrete processes approach the true manifold diffusion as more points are added. This matters because it allows sampling and exploring data manifolds without needing explicit maps or projections. The construction supports numerical simulation of paths confined to the data manifold.

Core claim

We introduce Implicit Manifold-valued Diffusions (IMDs) that define stochastic differential equations in the original high-dimensional space whose solutions evolve intrinsically on the underlying manifold. The construction approximates the infinitesimal generator using a proximity graph over the data points and the carré-du-champ operator, which encodes the local tangent spaces and lifts the intrinsic process into ambient coordinates. As the number of samples grows, the discrete diffusion process converges in law on the space of probability paths to its smooth manifold counterpart, and an Euler-Maruyama scheme enables numerical integration.
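For readers unfamiliar with the operator, the standard definition of the carré-du-champ and the way it yields an ambient-space SDE can be written as follows. The normalisation (the factor 1/2 and the choice L = ½Δ_M in the example) and the symbols b, a and P are conventions assumed here for illustration; the paper may use different ones.

```latex
% Carré-du-champ of a generator L (standard definition)
\Gamma(f,g) = \tfrac12\bigl( L(fg) - f\,Lg - g\,Lf \bigr)

% For a diffusion generator
%   L = \sum_i b^i \partial_i + \tfrac12 \sum_{i,j} a^{ij} \partial_i \partial_j
% applied to the ambient coordinate functions x^i:
L x^i = b^i, \qquad 2\,\Gamma(x^i, x^j) = a^{ij}

% Hence, in ambient coordinates,
dX_t = (L x)(X_t)\,dt + \sqrt{2\,\Gamma(x,x)(X_t)}\;dW_t

% Example: L = \tfrac12 \Delta_M on an embedded manifold M gives
%   2\,\Gamma(x^i, x^j) = \langle \nabla_M x^i, \nabla_M x^j \rangle = P_{ij},
% the orthogonal projector onto the tangent space T_x M.
```

Read this way, the drift and the diffusion matrix of the lifted process are both obtained by applying the generator to coordinate functions, which is why no charts or projections need to be supplied separately.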

What carries the argument

A proximity graph over the data points combined with the carré-du-champ operator, which recovers local tangent spaces and lifts the intrinsic diffusion into the ambient high-dimensional coordinates.
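A minimal numerical sketch of this mechanism, assuming a Gaussian-kernel proximity graph on samples of the unit circle: the discrete carré-du-champ of the coordinate functions at a sample is half the weighted second moment of the displacements to its neighbours, and its dominant eigenvector should align with the tangent direction. The kernel, the bandwidth `sigma`, the sample size and the function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def random_walk_matrix(X, sigma):
    """Row-stochastic Gaussian-kernel matrix on the point cloud X (N x D)."""
    gram = X @ X.T
    sq_norms = np.diag(gram)
    sq_dist = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram, 0.0)
    W = np.exp(-sq_dist / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W / W.sum(axis=1, keepdims=True)

def carre_du_champ(X, P, i):
    """Discrete Gamma(x, x) at sample i for L_N f(x_i) = sum_j P_ij (f(x_j) - f(x_i))."""
    D = X - X[i]                      # displacements x_j - x_i
    return 0.5 * (D.T * P[i]) @ D     # 0.5 * sum_j P_ij (x_j - x_i)(x_j - x_i)^T

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
X = np.column_stack([np.cos(theta), np.sin(theta)])  # samples on the unit circle in R^2

P = random_walk_matrix(X, sigma=0.1)  # sigma is an assumed bandwidth
i = 0
gamma = carre_du_champ(X, P, i)
eigvals, eigvecs = np.linalg.eigh(gamma)          # eigenvalues in ascending order
tangent_estimate = eigvecs[:, -1]                 # dominant direction
normal_alignment = abs(tangent_estimate @ X[i])   # 0 for a perfectly tangent direction
eigenvalue_ratio = eigvals[0] / eigvals[1]        # normal vs. tangent spread
print(normal_alignment, eigenvalue_ratio)
```

On the circle the outward normal at a sample is the sample itself, so a small `normal_alignment` together with a small `eigenvalue_ratio` indicates that the matrix is close to a rank-one tangent projector up to scale.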

If this is right

An Euler-Maruyama scheme can be used for numerical integration of the IMDs.
In experiments on synthetic manifolds and the MNIST data manifold the simulated paths remain confined to the manifold.
The processes enable guided exploration of the data manifold.
The framework supplies a foundation for manifold-aware sampling and generative modeling.
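The Euler-Maruyama step can be sketched on a toy point cloud. This is not the paper's scheme: it assumes a Gaussian kernel of bandwidth `sigma`, roughly uniform samples on the unit circle, coefficients frozen at the nearest sample, and a division by `sigma**2` so that the discrete generator approximates ½Δ_M. All function names are hypothetical.

```python
import numpy as np

def random_walk_matrix(X, sigma):
    """Row-stochastic Gaussian-kernel matrix on the point cloud X (N x D)."""
    gram = X @ X.T
    sq_norms = np.diag(gram)
    sq_dist = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram, 0.0)
    W = np.exp(-sq_dist / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(axis=1, keepdims=True)

def coefficients(X, P, i, sigma):
    """Drift (L_N x)(x_i) and diffusion matrix 2*Gamma_N(x, x)(x_i), rescaled by sigma^2."""
    D = X - X[i]
    drift = (P[i] @ D) / sigma**2
    diffusion = (D.T * P[i]) @ D / sigma**2
    return drift, diffusion

def psd_sqrt(A):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def euler_maruyama(X, P, sigma, y0, h, n_steps, rng):
    """Integrate dY = b dt + sqrt(a) dW with b, a read off at the nearest sample."""
    y = np.array(y0, dtype=float)
    path = [y.copy()]
    for _ in range(n_steps):
        i = np.argmin(np.sum((X - y) ** 2, axis=1))   # nearest data point
        drift, diffusion = coefficients(X, P, i, sigma)
        noise = rng.standard_normal(y.shape)
        y = y + h * drift + np.sqrt(h) * psd_sqrt(diffusion) @ noise
        path.append(y.copy())
    return np.array(path)

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
X = np.column_stack([np.cos(theta), np.sin(theta)])   # unit circle in R^2
sigma = 0.1                                           # assumed bandwidth
P = random_walk_matrix(X, sigma)

path = euler_maruyama(X, P, sigma, y0=X[0], h=1e-3, n_steps=1000, rng=rng)
radii = np.linalg.norm(path, axis=1)
print(radii.min(), radii.max())
```

The printed radii give a rough sense of how far the discretised path strays from the circle at a finite step size and bandwidth; the drift term, which points along the mean-curvature direction, is what pulls the path back toward the samples.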

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This construction could be combined with existing generative models to enforce manifold structure during sampling steps.
Similar graph-based approximations might extend to other stochastic processes such as jump diffusions on manifolds.
Practical tests on datasets with independently verified manifold structure would provide direct checks on convergence rates.

Load-bearing premise

The data points are sampled densely enough from a smooth manifold so that the proximity graph plus carré-du-champ operator accurately recovers the local tangent spaces and the intrinsic generator without additional geometric primitives.

What would settle it

Simulating paths from the discrete IMD on successively denser samples from a known manifold and checking whether the generated probability paths match those of the true intrinsic diffusion, or diverge when the graph fails to capture the geometry.
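The reference side of such a comparison can be sketched on a manifold where the intrinsic diffusion is known in closed form. The example below assumes Brownian motion with generator ½Δ on the unit sphere S^d, written in ambient coordinates as dX = −(d/2) X dt + (I − XXᵀ/|X|²) dW, for which E[X_T] = e^{−dT/2} X_0. The choice of manifold, of the endpoint statistic ⟨µ, Y_T⟩ and of all parameters is illustrative rather than taken from the paper.

```python
import numpy as np

def sphere_bm_endpoints(mu, T, h, n_paths, rng):
    """Euler-Maruyama endpoints of Brownian motion on S^d in R^(d+1), started at mu."""
    d = mu.shape[0] - 1
    Y = np.tile(mu, (n_paths, 1)).astype(float)
    for _ in range(int(round(T / h))):
        xi = rng.standard_normal(Y.shape)
        # Remove the radial part of the noise: (I - Y Y^T / |Y|^2) xi
        radial = np.sum(xi * Y, axis=1, keepdims=True) / np.sum(Y * Y, axis=1, keepdims=True)
        tangent_noise = xi - radial * Y
        Y = Y - 0.5 * d * h * Y + np.sqrt(h) * tangent_noise
    return Y

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0, 1.0])   # start point on S^2
d = mu.shape[0] - 1
T, h = 0.5, 0.005

Y_T = sphere_bm_endpoints(mu, T, h, n_paths=4000, rng=rng)
t_stat = Y_T @ mu                        # endpoint statistic <mu, Y_T>
empirical_mean = t_stat.mean()
exact_mean = np.exp(-0.5 * d * T)        # closed-form E<mu, X_T> for generator (1/2) Laplacian
mean_radius = np.linalg.norm(Y_T, axis=1).mean()
print(empirical_mean, exact_mean, mean_radius)
```

A data-driven process built from increasingly dense samples of the same sphere could then be scored against `exact_mean`, or against the full histogram of `t_stat`, to see whether the gap shrinks as the sample size grows.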

Figures

Figures reproduced from arXiv: 2604.07213 by Adam Gosztolai, Clara Grotehans, Pierre Vandergheynst, Victor Kawasaki-Borruat.

**Figure 1.** No prior knowledge of T^2 beyond the samples (gray) is required to compute the displayed Brownian motion (red). In this work, we take an operator-theoretic approach to diffusion on implicit manifolds. Starting from a proximity graph built from the point cloud X_N, we consider the associated random walk graph Laplacian L_N, the discrete generator of a local Markov process on the graph. We then show that,…

**Figure 2.** Histogram of the endpoint statistic t = ⟨µ, Y_T⟩ ∈ [−1, 1] (blue) under Langevin dynamics computed with IMDs, compared with the theoretical density induced by the von Mises–Fisher distribution (red) on S^7 ⊂ R^8. Close agreement indicates that the simulated process recovers the target equilibrium law. See Section 6.1 for more details. Geometric limits of graph Laplacians This is the theoretical backbone o…

**Figure 3.** We use the score of a pre-trained SGM s_σ(x) ≈ ∇_x log p_σ(x) as a retraction toward the data manifold. Numerical approximations of IMDs necessarily operate with finite step sizes, whereas the guarantees of Theorems 2 and 3 hold in the infinitesimal limit h → 0. In practice, only applying the Euler-Maruyama (E-M) discretization of the IMD SDE may lead to off-manifold deviations due to discretization error, wh…

**Figure 4.** IMDs produce a smooth transition between two dissimilar data points.

**Figure 5.** Step size is 1e-3, simulated over 5’000 steps. The Laplacian is computed over 10’000 samples. We notice that while the Swiss Roll is not boundaryless, the nearest-neighbour approach to estimating L(X_t) via P*_N prevents off-manifold drift.

**Figure 6.** Comparison of diffusion trajectories (top row) and corresponding radial errors (bottom row).

**Figure 7.** Comparison of diffusion trajectories on the Swiss roll (top row) and corresponding latent

Original abstract

High-dimensional data are often assumed to lie on lower-dimensional manifolds. We study how to construct diffusion processes on this data manifold using only point cloud samples and without access to charts, projections, or other geometric primitives. Here, we introduce Implicit Manifold-valued Diffusions (IMDs), a data-driven mathematical formalism for defining stochastic differential equations in the original high-dimensional space that describe drifting Brownian particles evolving intrinsically on the underlying manifold. Our construction hinges on approximating the corresponding infinitesimal generator of the diffusion process using a proximity graph over the data and using the carré-du-champ of the generator, which encodes the local tangent spaces of the manifold and lifts the intrinsic process into ambient coordinates. We show that as the number of samples grows, our discrete diffusion process converges in law on the space of probability paths to its smooth manifold counterpart. We further present an Euler-Maruyama scheme for the numerical integration of IMDs. We validate our framework using numerical experiments on synthetic manifolds and the MNIST data manifold, showing that IMDs remain confined over the manifold and enable its guided exploration. Our work provides the mathematical foundation and practical implementations of diffusion processes on data manifolds, opening new avenues for manifold-aware sampling, exploration, and generative modeling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Graph-based construction for intrinsic diffusions on implicit manifolds from point clouds is new, but the convergence claim likely needs explicit scaling on graph radius to hold.

The key takeaway is that this work gives a graph-based construction for diffusion processes that evolve intrinsically on a manifold but are simulated in the ambient space, using only samples and no explicit geometry. What stands out as new is the combination of a proximity graph to approximate the generator with the carré-du-champ to encode tangent spaces and lift the process. This avoids charts or projections, which is practical for high-dimensional data like images. The convergence claim in law on path space is the main theoretical result, and they back it with an Euler-Maruyama integrator.

The paper does a decent job on the practical side. The experiments on synthetic manifolds and the MNIST manifold show that the simulated paths stay on the data manifold and can be guided, which is relevant for generative modeling. This could help organize sampling methods around manifold structure rather than ambient coordinates.

On the soft spots, the convergence result is stated as holding when the number of samples grows, but graph Laplacian approximations typically need the neighborhood size to shrink at a specific rate depending on dimension and density to capture the intrinsic geometry. If that scaling isn't made explicit or if the proof assumes it without checking robustness, the limit might not be to the manifold diffusion. The abstract doesn't flag this, so I'd want to see the full assumptions and any error analysis in the paper. The soundness feels a bit thin without those details visible.

This paper is for people in manifold learning and diffusion-based generative models who want a more intrinsic approach. A reader working on theoretical aspects of score matching or sampling on manifolds would get the most out of it. It has enough of a new formalism and some validation to merit a serious referee, though revisions on the convergence conditions would be needed. I recommend putting it through peer review rather than desk rejecting it.

Referee Report

1 major / 2 minor

Summary. The paper introduces Implicit Manifold-valued Diffusions (IMDs), a framework for constructing diffusion processes on an unknown manifold from point-cloud samples alone. It approximates the intrinsic generator via a proximity graph and the carré-du-champ operator, which encodes local tangent spaces, then lifts the process into ambient coordinates. The central theoretical claim is that the resulting discrete process converges in law on path space to the smooth manifold diffusion as the number of samples n tends to infinity. An Euler-Maruyama discretization is provided, and the method is illustrated on synthetic manifolds and the MNIST data manifold, where trajectories remain confined to the manifold and permit guided exploration.

Significance. If the convergence result holds under the stated assumptions, the work supplies a mathematically grounded way to perform intrinsic diffusion on data manifolds without charts or explicit geometric primitives. This could support manifold-aware sampling and generative modeling in high-dimensional settings where only samples are available. The numerical validation on MNIST demonstrates practical confinement to the data manifold, which is a concrete strength of the empirical component.

major comments (1)

[Theorem on convergence in law] Theorem on convergence in law (likely §4 or the main result following the generator construction): the statement claims that the discrete process converges in law to the intrinsic manifold diffusion as n→∞, but does not condition on a scaling regime for the proximity-graph radius ε_n (e.g., ε_n→0 with nε_n^{d+2}→∞ or the analogous k-NN rate). Without this explicit regime, the graph-based generator may converge to the ambient Euclidean Laplacian rather than the intrinsic Laplace-Beltrami operator, undermining the path-space limit to the claimed manifold process. This is load-bearing for the central claim.
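As a worked instance of the regime named in this comment, consider a polynomial radius schedule. The schedule ε_n = n^{−α} is an illustrative assumption rather than one taken from the paper, and sharper results in the graph-Laplacian literature may carry additional logarithmic factors.

```latex
\varepsilon_n = n^{-\alpha},\ \alpha > 0
\;\Longrightarrow\;
\varepsilon_n \to 0
\quad\text{and}\quad
n\,\varepsilon_n^{d+2} = n^{\,1-\alpha(d+2)} \to \infty
\iff \alpha < \frac{1}{d+2}.
% Example: for intrinsic dimension d = 2 this requires 0 < \alpha < 1/4,
% e.g. \varepsilon_n = n^{-1/5}.
```

The admissible window for α narrows as the intrinsic dimension d grows, so the radius must shrink more slowly on higher-dimensional manifolds.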

minor comments (2)

[Abstract and §1] The abstract and introduction refer to “the space of probability paths” without clarifying whether this is the space of continuous paths equipped with the uniform topology or a weaker Skorokhod-type topology; a brief sentence on the precise function space would improve readability.
[Euler-Maruyama scheme] In the Euler-Maruyama scheme section, the step-size h is introduced without an explicit relation to the graph radius ε_n; adding a short remark on how h should scale with ε_n would clarify the discretization error relative to the graph approximation error.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and valuable comments, which help clarify the conditions for our main convergence result. We address the major comment below.

Referee: [Theorem on convergence in law] Theorem on convergence in law (likely §4 or the main result following the generator construction): the statement claims that the discrete process converges in law to the intrinsic manifold diffusion as n→∞, but does not condition on a scaling regime for the proximity-graph radius ε_n (e.g., ε_n→0 with nε_n^{d+2}→∞ or the analogous k-NN rate). Without this explicit regime, the graph-based generator may converge to the ambient Euclidean Laplacian rather than the intrinsic Laplace-Beltrami operator, undermining the path-space limit to the claimed manifold process. This is load-bearing for the central claim.

Authors: We agree that an explicit scaling regime for ε_n is necessary to guarantee convergence of the graph Laplacian to the intrinsic Laplace-Beltrami operator. The proof of the main theorem (Section 4) already assumes ε_n → 0 with n ε_n^{d+2} → ∞ (or the analogous k-NN condition) to obtain the required pointwise and uniform convergence of the discrete generator and carré-du-champ operator; these rates are stated in the technical assumptions and used throughout the error analysis. However, the formal theorem statement itself only mentions n → ∞ without restating the ε_n regime. We will revise the theorem to explicitly condition the path-space convergence on this scaling, thereby making the load-bearing assumption transparent. This is a clarification rather than a change to the result.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external graph approximation theory

The paper defines IMDs by constructing a discrete generator from a proximity graph plus carré-du-champ operator on point-cloud data, then proves that the resulting process converges in law on path space to the intrinsic manifold diffusion as n→∞. This limit statement is an asymptotic consistency result that invokes standard conditions on neighborhood scaling (ε_n→0 with nε_n^{d+2}→∞ or equivalent) drawn from the existing literature on graph Laplacians converging to the Laplace-Beltrami operator. No equation reduces the claimed convergence to a fitted parameter, a self-referential definition, or a load-bearing self-citation whose own justification is internal to the present work. The construction is therefore self-contained against external benchmarks in manifold learning and Dirichlet-form theory.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract only; the construction implicitly relies on the existence of a smooth manifold from which points are sampled and on the ability of a proximity graph to approximate the Laplace-Beltrami operator and its carré-du-champ. No explicit free parameters or invented entities are named in the provided text.

axioms (1)

domain assumption Data points are sampled from a smooth Riemannian manifold embedded in Euclidean space
Invoked to justify that the proximity graph recovers local tangent spaces and that the discrete process converges to the intrinsic diffusion.

pith-pipeline@v0.9.0 · 5750 in / 1277 out tokens · 40006 ms · 2026-05-21T09:32:30.641621+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "We show that as the number of samples grows, our discrete diffusion process converges in law on the space of probability paths to its smooth manifold counterpart."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "the carré-du-champ of the generator, which encodes the local tangent spaces of the manifold"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Generative models on phase space
hep-ph · 2026-04 · unverdicted · novelty 8.0
Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.

Neural Point-Forms
cs.LG · 2026-05 · unverdicted · novelty 6.0
Neural point-forms are introduced as permutation-invariant neural layers that output learned form-comparison matrices for point clouds, with a claimed consistency proof under sampling and manifold assumptions and comp...

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages · cited by 2 Pith papers · 3 internal anchors

[1]

Princeton University Press, 2008

P-A Absil, Robert Mahony, and Rodolphe Sepulchre.Optimization algorithms on matrix manifolds. Princeton University Press, 2008

work page 2008
[2]

Manifold learning by mixture models of vaes for inverse problems.Journal of Machine Learning Research, 25(202):1–35, 2024

Giovanni S Alberti, Johannes Hertrich, Matteo Santacesaria, and Silvia Sciutto. Manifold learning by mixture models of vaes for inverse problems.Journal of Machine Learning Research, 25(202):1–35, 2024

work page 2024
[3]

Springer Science & Business Media, 2013

Dominique Bakry, Ivan Gentil, and Michel Ledoux.Analysis and geometry of Markov diffusion operators, volume 348. Springer Science & Business Media, 2013

work page 2013
[4]

Bronstein, Pierre Vandergheynst, and Adam Gosztolai

Jacob Bamberger, Iolo Jones, Dennis Duncan, Michael M. Bronstein, Pierre Vandergheynst, and Adam Gosztolai. Carré du champ flow matching: better quality-generalisation tradeoff in generative models, 2025. URLhttps://arxiv.org/abs/2510.05930

work page arXiv 2025
[5]

Riemannian metric matching for scalable geometric modelling of distributions

Jacob Bamberger, Adam Gosztolai, Pierre Vandergheynst, Michael M Bronstein, and Iolo Jones. Riemannian metric matching for scalable geometric modelling of distributions. InICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling, 2026

work page 2026
[6]

Laplacian eigenmaps for dimensionality reduction and data representation.Neural computation, 15(6):1373–1396, 2003

Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation.Neural computation, 15(6):1373–1396, 2003

work page 2003
[7]

Semi-supervised learning on riemannian manifolds.Machine learning, 56(1):209–239, 2004

Mikhail Belkin and Partha Niyogi. Semi-supervised learning on riemannian manifolds.Machine learning, 56(1):209–239, 2004

work page 2004
[8]

Towards a theoretical foundation for laplacian-based manifold methods.Journal of Computer and System Sciences, 74(8):1289–1308, 2008

Mikhail Belkin and Partha Niyogi. Towards a theoretical foundation for laplacian-based manifold methods.Journal of Computer and System Sciences, 74(8):1289–1308, 2008

work page 2008
[9]

Sampling and estimation on manifolds using the langevin diffusion.Journal of Machine Learning Research, 26(71):1–50, 2025

Karthik Bharath, Alexander Lewis, Akash Sharma, and Michael V Tretyakov. Sampling and estimation on manifolds using the langevin diffusion.Journal of Machine Learning Research, 26(71):1–50, 2025

work page 2025
[10]

John Wiley & Sons, 2013

Patrick Billingsley.Convergence of probability measures. John Wiley & Sons, 2013

work page 2013
[11]

Dynamical regimes of diffusion models.Nature Communications, 15(1):9957, 2024

Giulio Biroli, Tony Bonnaire, Valentin De Bortoli, and Marc Mézard. Dynamical regimes of diffusion models.Nature Communications, 15(1):9957, 2024

work page 2024
[12]

Stochastic gradient descent on riemannian manifolds.IEEE Transactions on Automatic Control, 58(9):2217–2229, 2013

Silvere Bonnabel. Stochastic gradient descent on riemannian manifolds.IEEE Transactions on Automatic Control, 58(9):2217–2229, 2013

work page 2013
[13]

& Mézard, M.Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in TrainingarXiv:2505.17638 [cs]

Tony Bonnaire, Raphaël Urfin, Giulio Biroli, and Marc Mézard. Why diffusion models don’t memorize: The role of implicit dynamical regularization in training.arXiv preprint arXiv:2505.17638, 2025

work page arXiv 2025
[14]

Cambridge University Press, Cambridge, UK (2023)

Nicolas Boumal.An introduction to optimization on smooth manifolds. Cambridge University Press, 2023. doi: 10.1017/9781009166164. URL https://www.nicolasboumal.net/book

work page doi:10.1017/9781009166164 2023
[15]

On the edge of memorization in diffusion models.arXiv preprint arXiv:2508.17689, 2025

Sam Buchanan, Druv Pai, Yi Ma, and Valentin De Bortoli. On the edge of memorization in diffusion models.arXiv preprint arXiv:2508.17689, 2025

work page arXiv 2025
[16]

A graph discretization of the laplace– beltrami operator.Journal of Spectral Theory, 4(4):675–714, 2015

Dmitri Burago, Sergei Ivanov, and Yaroslav Kurylev. A graph discretization of the laplace– beltrami operator.Journal of Spectral Theory, 4(4):675–714, 2015

work page 2015
[17]

Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Saptarshi Chakraborty, Quentin Berthet, and Peter L Bartlett. Generalization properties of score-matching diffusion models for intrinsically low-dimensional data.arXiv preprint arXiv:2603.03700, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[18]

Flow matching on general geometries.arXiv preprint arXiv:2302.03660, 2023

Ricky TQ Chen and Yaron Lipman. Flow matching on general geometries.arXiv preprint arXiv:2302.03660, 2023

work page arXiv 2023
[19]

Efficient sampling on riemannian manifolds via langevin mcmc

Xiang Cheng, Jingzhao Zhang, and Suvrit Sra. Efficient sampling on riemannian manifolds via langevin mcmc. InProceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ’22, Red Hook, NY , USA, 2022. Curran Associates Inc. ISBN 9781713871088. 10

work page 2022
[20]

Theory and algorithms for diffusion processes on riemannian manifolds.arXiv preprint arXiv:2204.13665, 2022

Xiang Cheng, Jingzhao Zhang, and Suvrit Sra. Theory and algorithms for diffusion processes on riemannian manifolds.arXiv preprint arXiv:2204.13665, 2022

work page arXiv 2022
[21]

Sinho Chewi.Log-Concave Sampling. 2025. URLhttps://chewisinho.github.io/

work page 2025
[22]

American Mathematical Soc., 1997

Fan RK Chung.Spectral graph theory, volume 92. American Mathematical Soc., 1997

work page 1997
[23]

Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006

Ronald R Coifman and Stéphane Lafon. Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006

work page 2006
[24]

Convergence of denoising diffusion models under the manifold hypoth- esis.arXiv preprint arXiv:2208.05314,

Valentin De Bortoli. Convergence of denoising diffusion models under the manifold hypothesis. arXiv preprint arXiv:2208.05314, 2022

work page arXiv 2022
[25]

Riemannian score-based generative modelling.Advances in neural information processing systems, 35:2406–2422, 2022

Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, and Arnaud Doucet. Riemannian score-based generative modelling.Advances in neural information processing systems, 35:2406–2422, 2022

work page 2022
[26]

Springer, 1992

Manfredo Perdigao Do Carmo and J Flaherty Francis.Riemannian geometry, volume 2. Springer, 1992

work page 1992
[27]

John Wiley & Sons, 2009

Stewart N Ethier and Thomas G Kurtz.Markov processes: characterization and convergence. John Wiley & Sons, 2009

work page 2009
[28]

Testing the manifold hypothesis

Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan. Testing the manifold hypothesis. Journal of the American Mathematical Society, 29(4):983–1049, 2016

work page 2016
[29]

Data-driven efficient solvers for langevin dynamics on manifold in high dimensions.Applied and Computational Harmonic Analysis, 62:261–309, 2023

Yuan Gao, Jian-Guo Liu, and Nan Wu. Data-driven efficient solvers for langevin dynamics on manifold in high dimensions.Applied and Computational Harmonic Analysis, 62:261–309, 2023

work page 2023
[30]

Continuum limit of total variation on point clouds

Nicolás García Trillos and Dejan Slepˇcev. Continuum limit of total variation on point clouds. Archive for rational mechanics and analysis, 220(1):193–241, 2016

work page 2016
[31]

Nicolás García Trillos, Moritz Gerlach, Matthias Hein, and Dejan Slepˇcev. Error estimates for spectral convergence of the graph laplacian on random geometric graphs toward the laplace– beltrami operator.Foundations of Computational Mathematics, 20(4):827–887, 2020

work page 2020
[32]

Generative learning of densities on manifolds.Computer Methods in Applied Mechanics and Engineering, 446:118266, 2025

Dimitris G Giovanis, Ellis Crabtree, Roger G Ghanem, and Ioannis G Kevrekidis. Generative learning of densities on manifolds.Computer Methods in Applied Mechanics and Engineering, 446:118266, 2025

work page 2025
[33]

American Mathemat- ical Soc., 2009

Alexander Grigoryan.Heat kernel and analysis on manifolds, volume 47. American Mathemat- ical Soc., 2009

work page 2009
[34]

Geometric numerical integration.Oberwolfach Reports, 3(1):805–882, 2006

Ernst Hairer, Marlis Hochbruck, Arieh Iserles, and Christian Lubich. Geometric numerical integration.Oberwolfach Reports, 3(1):805–882, 2006

work page 2006
[35]

Graph laplacians and their convergence on random neighborhood graphs.Journal of Machine Learning Research, 8(6), 2007

Matthias Hein, Jean-Yves Audibert, and Ulrike von Luxburg. Graph laplacians and their convergence on random neighborhood graphs.Journal of Machine Learning Research, 8(6), 2007

work page 2007
[36]

Molecular dynamics simulation for all.Neuron, 99(6): 1129–1143, 2018

Scott A Hollingsworth and Ron O Dror. Molecular dynamics simulation for all.Neuron, 99(6): 1129–1143, 2018

work page 2018
[37]

Number 38

Elton P Hsu.Stochastic analysis on manifolds. Number 38. American Mathematical Soc., 2002

work page 2002
[38]

Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6(24):695–709, 2005

Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6(24):695–709, 2005. URL http://jmlr.org/papers/v6/ hyvarinen05a.html

work page 2005
[39]

Springer Science & Business Media, 2011

Jean Jacod and Philip Protter.Discretization of processes, volume 67. Springer Science & Business Media, 2011

work page 2011
[40]

Diffusion geometry, 2024

Iolo Jones. Diffusion geometry, 2024. URLhttps://arxiv.org/abs/2405.10858. 11

work page arXiv 2024
[41]

Computing diffusion geometry.arXiv preprint arXiv:2602.06006, 2026

Iolo Jones and David Lanners. Computing diffusion geometry.arXiv preprint arXiv:2602.06006, 2026

work page arXiv 2026
[42]

Landing with the score: Riemannian optimization through denoising, 2025

Andrey Kharitenko, Zebang Shen, Riccardo de Santi, Niao He, and Florian Doerfler. Landing with the score: Riemannian optimization through denoising, 2025. URL https://arxiv. org/abs/2509.23357

work page arXiv 2025
[43]

Low Rank Approximation Lecture 9

Daniel Kressner. Low Rank Approximation Lecture 9. 2018. URL https://www.epfl.ch/ labs/anchp/wp-content/uploads/2018/12/lecture9-slides.pdf

work page 2018
[44]

Convergence of spectral structures: A functional analytic theory and its applications to spectral geometry.Communications in Analysis and Geometry, 11 (4):599–673, September 2003

Kazuhiro Kuwae and Takashi Shioya. Convergence of spectral structures: A functional analytic theory and its applications to spectral geometry.Communications in Analysis and Geometry, 11 (4):599–673, September 2003. ISSN 1019-8385. doi: 10.4310/CAG.2003.v11.n4.a1

work page doi:10.4310/cag.2003.v11.n4.a1 2003
[45]

John Wiley & Sons, 2014

Peter D Lax.Functional analysis. John Wiley & Sons, 2014

work page 2014
[46]

Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002

Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002

work page 2002
[47]

Diffusion map particle systems for generative modeling.arXiv preprint arXiv:2304.00200, 2023

Fengyi Li and Youssef Marzouk. Diffusion map particle systems for generative modeling.arXiv preprint arXiv:2304.00200, 2023

work page arXiv 2023
[48]

Stochastic lie group integrators.SIAM Journal on Scientific Computing, 30(2):597–617, 2008

Simon JA Malham and Anke Wiese. Stochastic lie group integrators.SIAM Journal on Scientific Computing, 30(2):597–617, 2008

work page 2008
[49]

Stochastic gradient descent as approximate bayesian inference.Journal of Machine Learning Research, 18(134):1–35, 2017

Stephan Mandt, Matthew D Hoffman, and David M Blei. Stochastic gradient descent as approximate bayesian inference.Journal of Machine Learning Research, 18(134):1–35, 2017

work page 2017
[50]

John Wiley & Sons, 2009

Kanti V Mardia and Peter E Jupp.Directional statistics. John Wiley & Sons, 2009

work page 2009
[51]

Mcmc using hamiltonian dynamics.Handbook of markov chain monte carlo, pages 47–95, 2011

Radford M Neal. Mcmc using hamiltonian dynamics.Handbook of markov chain monte carlo, pages 47–95, 2011

work page 2011
[52]

Intrinsic gaussian process on unknown manifolds with probabilistic metrics.Journal of Machine Learning Research, 24(104): 1–42, 2023

Mu Niu, Zhenwen Dai, Pokman Cheung, and Yizhu Wang. Intrinsic gaussian process on unknown manifolds with probabilistic metrics.Journal of Machine Learning Research, 24(104): 1–42, 2023

work page 2023
[53]

Stochastic differential equations

Bernt Øksendal. Stochastic differential equations. InStochastic differential equations: an introduction with applications, pages 38–50. Springer, 2003

work page 2003
[54]

A neural manifold view of the brain

Matthew G Perich, Devika Narain, and Juan A Gallego. A neural manifold view of the brain. Nature Neuroscience, 28(8):1582–1597, 2025

work page 2025
[55] Jakiw Pidstrigach. Score-based generative models detect manifolds. Advances in Neural Information Processing Systems, 35:35852–35865, 2022.
[56] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597, 2015.
[57] Matteo Saveriano, Fares J Abu-Dakka, and Ville Kyrki. Learning stable robotic skills on Riemannian manifolds. Robotics and Autonomous Systems, 169:104510, 2023.
[58] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
[59] Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 12438–12448. Curran Associates, Inc., 2020.
[60] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations, 2021. URL https://arxiv.org/abs/2011.13456
[61] Daniel W Stroock and SR Srinivasa Varadhan. Multidimensional diffusion processes. Springer, 2007.
[62] Johanna Tengler, Christoph Brune, and José A. Iglesias. Manifold limit for the training of shallow graph convolutional neural networks, 2026. URL https://arxiv.org/abs/2601.06025
[63] Nicolás García Trillos and Dejan Slepčev. On the rate of convergence of empirical measures in ∞-transportation distance. Canadian Journal of Mathematics, 67(6):1358–1383, 2015.
[64] Nicolás García Trillos and Dejan Slepčev. A variational approach to the consistency of spectral clustering. Applied and Computational Harmonic Analysis, 45(2):239–281, 2018.
[65] Vladimir Vapnik. The nature of statistical learning theory. Springer Science & Business Media, 2013.
[66] Veeravalli S Varadarajan. On the convergence of sample probability distributions. Sankhyā: The Indian Journal of Statistics (1933-1960), 19(1/2):23–26, 1958.
[67] Cédric Villani et al. Optimal transport: old and new, volume 338. Springer, 2009.
[68] Max Welling and Yee W Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 681–688, 2011.
[69] Caroline L Wormell and Sebastian Reich. Spectral convergence of diffusion maps: Improved error bounds and an alternative normalization. SIAM Journal on Numerical Analysis, 59(3):1687–1734, 2021.
[70] Pan Xu, Jinghui Chen, Difan Zou, and Quanquan Gu. Global convergence of Langevin dynamics based algorithms for nonconvex optimization. Advances in Neural Information Processing Systems, 31, 2018.
[71] Jason Yim, Brian L Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, and Tommi Jaakkola. SE(3) diffusion model with application to protein backbone generation. In Proceedings of the 40th International Conference on Machine Learning, pages 40001–40039, 2023.
[72] Olga Zaghen, Floor Eijkelboom, Alison Pouplin, Cong Liu, Max Welling, Jan-Willem van de Meent, and Erik J. Bekkers. Riemannian variational flow matching for material and protein design, 2025. URL https://arxiv.org/abs/2502.12981

A Aesthetically pleasing plots

Figure 5: (a) 2-dimensional sphere; (b) the torus $T^2$; (c) Swiss roll. Step size is 1e-3, simulated o...
[73] For all $f \in H^1(M)$,³ we have

$$E_N(P_N f) \le \underbrace{\left(1 + C'_1\,\epsilon + C'_2\,\frac{\varepsilon}{\epsilon} + C'_3\,\epsilon^2\right)}_{=:\,\delta'_N} E(f), \tag{46}$$

and

$$C'_1 = C\alpha L_p, \qquad C'_2 = C\,d + \frac{2^{d+1} L_{k_\epsilon}(1 + \alpha L_p)}{k_\epsilon(1/2)}, \qquad C'_3 = C d\,(K + R^{-2}),$$

and $C$ is a universal constant.

³ Note that any $f \in H^1(M)$ is automatically in $C(M)$.
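[74] For any $u \in H_N$, we have

$$E(I_N u) \le \underbrace{\left(1 + C''_1\,\epsilon + C''_2\,\frac{\varepsilon}{\epsilon} + C''_3\,\epsilon^2\right)}_{=:\,\delta''_N} E_N(u), \tag{47}$$

where $I_N$ is the interpolation map and

$$C''_1 = \alpha L_p, \qquad C''_2 = C\,(d + C'_2), \qquad C''_3 = \left(1 + \frac{1}{\sigma_{k_\epsilon}}\right) d K.$$

Corollary 1. We immediately notice that from Appendix C $h_N \propto \sqrt{\epsilon}$, implying that $\epsilon$, $\varepsilon/\epsilon$ and $\epsilon^2$ all go to zero as sample size $N \to \infty$, implying that all terms involving $C'_i$ ...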
[75] indicates the current framework should appeal to such kernels. We thus disclaim here that the MNIST experiments should be interpreted as empirical evidence, rather than fully theorem-backed instantiations. We compute the random walk graph Laplacian in the same way in both cases, using Eq. (14). The CDC operator is computed by considering its action on coo...
[76] and inject it as a bias into the first linear layer (MLP for synthetic data) or as a per-channel bias at every convolutional block (U-Net for MNIST). We use a geometric schedule of noise levels: 20 levels from σmin = 0.005 to σmax = 1 for the synthetic examples, and 100 levels from σmin = 0.01 to σmax = 15 for MNIST. Trainin...

[1] P-A Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2008.

[2] Giovanni S Alberti, Johannes Hertrich, Matteo Santacesaria, and Silvia Sciutto. Manifold learning by mixture models of VAEs for inverse problems. Journal of Machine Learning Research, 25(202):1–35, 2024.

[3] Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and geometry of Markov diffusion operators, volume 348. Springer Science & Business Media, 2013.

[4] Jacob Bamberger, Iolo Jones, Dennis Duncan, Michael M. Bronstein, Pierre Vandergheynst, and Adam Gosztolai. Carré du champ flow matching: better quality-generalisation tradeoff in generative models, 2025. URL https://arxiv.org/abs/2510.05930

[5] Jacob Bamberger, Adam Gosztolai, Pierre Vandergheynst, Michael M Bronstein, and Iolo Jones. Riemannian metric matching for scalable geometric modelling of distributions. In ICLR 2026 Workshop on Geometry-grounded Representation Learning and Generative Modeling, 2026.

[6] Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373–1396, 2003.

[7] Mikhail Belkin and Partha Niyogi. Semi-supervised learning on Riemannian manifolds. Machine Learning, 56(1):209–239, 2004.

[8] Mikhail Belkin and Partha Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 74(8):1289–1308, 2008.

[9] Karthik Bharath, Alexander Lewis, Akash Sharma, and Michael V Tretyakov. Sampling and estimation on manifolds using the Langevin diffusion. Journal of Machine Learning Research, 26(71):1–50, 2025.

[10] Patrick Billingsley. Convergence of probability measures. John Wiley & Sons, 2013.

[11] Giulio Biroli, Tony Bonnaire, Valentin De Bortoli, and Marc Mézard. Dynamical regimes of diffusion models. Nature Communications, 15(1):9957, 2024.

[12] Silvere Bonnabel. Stochastic gradient descent on Riemannian manifolds. IEEE Transactions on Automatic Control, 58(9):2217–2229, 2013.

[13] Tony Bonnaire, Raphaël Urfin, Giulio Biroli, and Marc Mézard. Why diffusion models don’t memorize: The role of implicit dynamical regularization in training. arXiv preprint arXiv:2505.17638, 2025.

[14] Nicolas Boumal. An introduction to optimization on smooth manifolds. Cambridge University Press, 2023. doi: 10.1017/9781009166164. URL https://www.nicolasboumal.net/book

[15] Sam Buchanan, Druv Pai, Yi Ma, and Valentin De Bortoli. On the edge of memorization in diffusion models. arXiv preprint arXiv:2508.17689, 2025.

[16] Dmitri Burago, Sergei Ivanov, and Yaroslav Kurylev. A graph discretization of the Laplace–Beltrami operator. Journal of Spectral Theory, 4(4):675–714, 2015.

[17] Saptarshi Chakraborty, Quentin Berthet, and Peter L Bartlett. Generalization properties of score-matching diffusion models for intrinsically low-dimensional data. arXiv preprint arXiv:2603.03700, 2026.

[18] Ricky TQ Chen and Yaron Lipman. Flow matching on general geometries. arXiv preprint arXiv:2302.03660, 2023.

[19] Xiang Cheng, Jingzhao Zhang, and Suvrit Sra. Efficient sampling on Riemannian manifolds via Langevin MCMC. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ’22, Red Hook, NY, USA, 2022. Curran Associates Inc. ISBN 9781713871088.

[20] Xiang Cheng, Jingzhao Zhang, and Suvrit Sra. Theory and algorithms for diffusion processes on Riemannian manifolds. arXiv preprint arXiv:2204.13665, 2022.

[21] Sinho Chewi. Log-Concave Sampling. 2025. URL https://chewisinho.github.io/

[22] Fan RK Chung. Spectral graph theory, volume 92. American Mathematical Soc., 1997.

[23] Ronald R Coifman and Stéphane Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006.

[24] Valentin De Bortoli. Convergence of denoising diffusion models under the manifold hypothesis. arXiv preprint arXiv:2208.05314, 2022.

[25] Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, and Arnaud Doucet. Riemannian score-based generative modelling. Advances in Neural Information Processing Systems, 35:2406–2422, 2022.

[26] Manfredo Perdigao Do Carmo and J Flaherty Francis. Riemannian geometry, volume 2. Springer, 1992.

[27] Stewart N Ethier and Thomas G Kurtz. Markov processes: characterization and convergence. John Wiley & Sons, 2009.

[28] Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan. Testing the manifold hypothesis. Journal of the American Mathematical Society, 29(4):983–1049, 2016.

[29] Yuan Gao, Jian-Guo Liu, and Nan Wu. Data-driven efficient solvers for Langevin dynamics on manifold in high dimensions. Applied and Computational Harmonic Analysis, 62:261–309, 2023.

[30] Nicolás García Trillos and Dejan Slepčev. Continuum limit of total variation on point clouds. Archive for Rational Mechanics and Analysis, 220(1):193–241, 2016.

[31] Nicolás García Trillos, Moritz Gerlach, Matthias Hein, and Dejan Slepčev. Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace–Beltrami operator. Foundations of Computational Mathematics, 20(4):827–887, 2020.

[32] Dimitris G Giovanis, Ellis Crabtree, Roger G Ghanem, and Ioannis G Kevrekidis. Generative learning of densities on manifolds. Computer Methods in Applied Mechanics and Engineering, 446:118266, 2025.

[33] Alexander Grigoryan. Heat kernel and analysis on manifolds, volume 47. American Mathematical Soc., 2009.

[34] Ernst Hairer, Marlis Hochbruck, Arieh Iserles, and Christian Lubich. Geometric numerical integration. Oberwolfach Reports, 3(1):805–882, 2006.

[35] Matthias Hein, Jean-Yves Audibert, and Ulrike von Luxburg. Graph Laplacians and their convergence on random neighborhood graphs. Journal of Machine Learning Research, 8(6), 2007.

[36] Scott A Hollingsworth and Ron O Dror. Molecular dynamics simulation for all. Neuron, 99(6):1129–1143, 2018.

[37] Elton P Hsu. Stochastic analysis on manifolds. Number 38. American Mathematical Soc., 2002.

[38] Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6(24):695–709, 2005. URL http://jmlr.org/papers/v6/hyvarinen05a.html

[39] Jean Jacod and Philip Protter. Discretization of processes, volume 67. Springer Science & Business Media, 2011.

[40] Iolo Jones. Diffusion geometry, 2024. URL https://arxiv.org/abs/2405.10858

[41] Iolo Jones and David Lanners. Computing diffusion geometry. arXiv preprint arXiv:2602.06006, 2026.

[42] Andrey Kharitenko, Zebang Shen, Riccardo de Santi, Niao He, and Florian Doerfler. Landing with the score: Riemannian optimization through denoising, 2025. URL https://arxiv.org/abs/2509.23357

[43] Daniel Kressner. Low Rank Approximation Lecture 9. 2018. URL https://www.epfl.ch/labs/anchp/wp-content/uploads/2018/12/lecture9-slides.pdf

[44] Kazuhiro Kuwae and Takashi Shioya. Convergence of spectral structures: A functional analytic theory and its applications to spectral geometry. Communications in Analysis and Geometry, 11(4):599–673, September 2003. ISSN 1019-8385. doi: 10.4310/CAG.2003.v11.n4.a1

[45] Peter D Lax. Functional analysis. John Wiley & Sons, 2014.

[46] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
