A Mathematical Introduction to Diffusion Models

Jianfeng Lu

arXiv: 2607.01693 · v1 · pith:O67NBPB4new · submitted 2026-07-02 · cs.LG · math.PR

Pith reviewed 2026-07-03 17:47 UTC · model grok-4.3

keywords: diffusion models, stochastic differential equations, sampling, error analysis, generative models, stochastic processes, inference control, reverse process

The pith

The notes connect classical sampling dynamics to modern diffusion samplers through time-reversed stochastic processes, presented with layered proofs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper supplies a proof-oriented introduction to diffusion models by tracing an arc from classical sampling dynamics through modern diffusion samplers, their error analysis, and inference-time control. Material is presented in three layers: core definitions and identities proved completely, representative estimates proved under simplifying assumptions, and research-level theorems stated with proof roadmaps. The structure targets readers who know probability but have not encountered stochastic differential equations or diffusion models, so each layer builds directly on the last and assumes nothing beyond probability. If the presentation succeeds, a reader can derive the reverse sampling process, bound its errors, and apply inference-time control from first principles.

Core claim

These notes establish a continuous arc from classical sampling dynamics to contemporary diffusion samplers by deriving the time-reversed stochastic differential equation from the forward noising process, supplying complete proofs of the central identities, representative error bounds under simplified conditions, and roadmaps for advanced results on error analysis and inference-time control.
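
For orientation, the time reversal named in the core claim is usually written as follows. This is a sketch in standard notation, not necessarily the paper's own: the symbols f, g, p_t, W, B, and the horizon T are assumptions introduced here, and the Ornstein-Uhlenbeck instance is only an illustrative special case.

```latex
% Forward noising process on [0, T]; p_t denotes the law of X_t.
\[
  \mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t,
  \qquad X_0 \sim p_0 .
\]
% Time reversal: Y_s has the same law as X_{T-s}, and B is a Brownian motion.
\[
  \mathrm{d}Y_s = \bigl[-f(Y_s, T-s) + g(T-s)^2\,\nabla \log p_{T-s}(Y_s)\bigr]\,\mathrm{d}s
                + g(T-s)\,\mathrm{d}B_s,
  \qquad Y_0 \sim p_T .
\]
% Ornstein-Uhlenbeck instance: f(x, t) = -x and g(t) = \sqrt{2}.
\[
  \mathrm{d}Y_s = \bigl[Y_s + 2\,\nabla \log p_{T-s}(Y_s)\bigr]\,\mathrm{d}s
                + \sqrt{2}\,\mathrm{d}B_s .
\]
```

The only unknown in the reversed dynamics is the score \nabla \log p_t, which is the quantity a diffusion model learns to approximate.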

What carries the argument

The forward diffusion process defined by a stochastic differential equation whose time reversal produces the generative sampling dynamics.
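
A minimal sketch of this mechanism, assuming a one-dimensional Ornstein-Uhlenbeck forward process dX = -X dt + sqrt(2) dW and a Gaussian data law N(m0, s0^2), so that the score is available in closed form. The function names, parameter values, and the choice of Euler-Maruyama are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ou_marginal(t, m0, s0):
    """Mean and variance of X_t for dX = -X dt + sqrt(2) dW with X_0 ~ N(m0, s0^2)."""
    decay = np.exp(-t)
    return m0 * decay, (s0 * decay) ** 2 + 1.0 - decay ** 2

def score(x, t, m0, s0):
    """Exact score d/dx log p_t(x) of the Gaussian marginal p_t."""
    mean, var = ou_marginal(t, m0, s0)
    return -(x - mean) / var

def reverse_sample(n, m0=2.0, s0=0.5, T=5.0, steps=500, seed=0):
    """Euler-Maruyama for the reversed SDE dY = [Y + 2 score(Y, T - s)] ds + sqrt(2) dB."""
    rng = np.random.default_rng(seed)
    h = T / steps
    y = rng.standard_normal(n)      # N(0, 1), the stationary law, stands in for p_T
    for k in range(steps):
        t = T - k * h               # forward time matching reverse time s = k * h
        drift = y + 2.0 * score(y, t, m0, s0)
        y = y + h * drift + np.sqrt(2.0 * h) * rng.standard_normal(n)
    return y

samples = reverse_sample(20_000)
# The sample mean and standard deviation should be close to m0 = 2.0 and s0 = 0.5.
print(samples.mean(), samples.std())
```

In a trained diffusion model the exact `score` would be replaced by a learned approximation; the closed form is used here only so the sketch can be checked against a known answer.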

If this is right

Error bounds for diffusion samplers follow from the simplified estimates once the core identities are in place.
Inference-time control emerges as an adjustment to the drift of the reverse process.
Advanced theorems on convergence and discretization become reachable once the basic reversal is proved.
The same layered structure applies to any sampler whose dynamics can be written as a controlled stochastic differential equation.
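
The drift-adjustment view of inference-time control can be illustrated in a fully Gaussian toy setting. The sketch below assumes an Ornstein-Uhlenbeck forward process dX = -X dt + sqrt(2) dW, a prior X_0 ~ N(M0, S0^2), and an observation y = X_0 + R * noise; all names and constants are hypothetical. It checks the Bayes identity that the conditional score equals the unconditional score plus a guidance term d/dx log p_t(y | x), which is what gets added to the reverse drift.

```python
import numpy as np

M0, S0, R = 1.0, 2.0, 0.5   # hypothetical prior mean/std and observation noise std

def marginal(t, m, s):
    """Mean and variance of X_t when X_0 ~ N(m, s^2) under the OU forward process."""
    decay = np.exp(-t)
    return m * decay, (s * decay) ** 2 + 1.0 - decay ** 2

def score_uncond(x, t):
    """Score of the unconditional marginal p_t."""
    mean, var = marginal(t, M0, S0)
    return -(x - mean) / var

def guidance(x, t, y):
    """d/dx log p_t(y | x), derived from the joint Gaussian law of (X_t, y)."""
    mean_t, var_t = marginal(t, M0, S0)
    decay = np.exp(-t)
    gain = decay * S0 ** 2 / var_t                   # Cov(y, X_t) / Var(X_t)
    y_mean = M0 + gain * (x - mean_t)                # E[y | X_t = x]
    y_var = S0 ** 2 + R ** 2 - gain * decay * S0 ** 2  # Var(y | X_t)
    return gain * (y - y_mean) / y_var

def score_cond(x, t, y):
    """Score of X_t given y, computed directly from the Gaussian posterior of X_0."""
    post_var = 1.0 / (1.0 / S0 ** 2 + 1.0 / R ** 2)
    post_mean = post_var * (M0 / S0 ** 2 + y / R ** 2)
    mean, var = marginal(t, post_mean, np.sqrt(post_var))
    return -(x - mean) / var

def guided_drift(x, t, y):
    """Controlled reverse drift: unconditional drift x + 2*score plus the term 2*guidance."""
    return x + 2.0 * (score_uncond(x, t) + guidance(x, t, y))

xs = np.linspace(-3.0, 3.0, 13)
gap = score_uncond(xs, 1.0) + guidance(xs, 1.0, 0.7) - score_cond(xs, 1.0, 0.7)
print(np.max(np.abs(gap)))   # floating-point round-off only
```

Outside the Gaussian case the guidance term has no closed form and must be approximated, which is where the choice of control method matters.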

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The reversal construction could be tested numerically on low-dimensional Ornstein-Uhlenbeck processes to check whether the derived sampler, started from the stationary distribution, recovers the correct initial (data) distribution.
The proof roadmaps suggest that replacing the Brownian driving noise with other Lévy processes would require only local changes to the estimates.
The separation between core proofs and research-level results indicates a natural division for course notes or textbook chapters on related generative models.
If the simplified estimates remain accurate outside the stated assumptions, they could serve as quick diagnostics for new diffusion variants.
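
One way such a numerical check could look, as a sketch under stated assumptions: a one-dimensional Ornstein-Uhlenbeck forward process dX = -X dt + sqrt(2) dW, a Gaussian target N(m0, s0^2), the exact score, and an Euler-Maruyama reverse sampler started from N(0, 1). Because the reverse drift is affine in this setting, the sampler's output law is Gaussian and its mean and variance follow a deterministic recursion, so the discretization error can be measured without Monte Carlo noise. Function names and parameter values are illustrative, not the paper's.

```python
import math

def euler_law(steps, m0=2.0, s0=0.5, T=5.0):
    """Exact mean and variance of the Euler-Maruyama reverse sampler's output.

    The reverse drift y + 2 * score(y, t) equals a(t) * y + b(t), so a Gaussian
    start stays Gaussian and its two moments obey a closed recursion.
    """
    h = T / steps
    mean, var = 0.0, 1.0                  # start from N(0, 1), standing in for p_T
    for k in range(steps):
        t = T - k * h
        decay = math.exp(-t)
        m_t = m0 * decay                  # mean of p_t
        v_t = (s0 * decay) ** 2 + 1.0 - decay ** 2   # variance of p_t
        a = 1.0 - 2.0 / v_t
        b = 2.0 * m_t / v_t
        mean = mean + h * (a * mean + b)
        var = (1.0 + h * a) ** 2 * var + 2.0 * h
    return mean, var

def w2_to_target(steps, m0=2.0, s0=0.5, T=5.0):
    """2-Wasserstein distance between the sampler's Gaussian output and N(m0, s0^2)."""
    mean, var = euler_law(steps, m0, s0, T)
    return math.hypot(mean - m0, math.sqrt(var) - s0)

for steps in (50, 100, 200, 400, 800):
    print(steps, w2_to_target(steps))
```

The printed distances should shrink as the number of steps grows, which gives a quick sanity check on the sampler before any learned score is involved.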

Load-bearing premise

The reader already knows probability theory but has no prior exposure to stochastic differential equations or diffusion models.

What would settle it

The premise fails if a reader who knows only probability works through the first chapter on the forward process and the derivation of its reverse, yet cannot obtain the standard Fokker-Planck reversal identity without consulting outside sources.

Original abstract

These notes give a proof-oriented introduction to diffusion models from the viewpoint of sampling, tracing a single arc from classical sampling dynamics to modern diffusion samplers, their error analysis, and inference-time control. Throughout, the material is layered into core definitions and identities proved in full, representative estimates proved under simplifying assumptions, and research-level theorems stated with a proof roadmap. The intended audience is beginning graduate students with a background in probability but no prior exposure to stochastic differential equations, stochastic numerics, or diffusion models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

These are solid pedagogical notes on diffusion models that layer proofs and roadmaps but contain no new results.

These notes are a pedagogical introduction to diffusion models written from the sampling perspective. The key point is that they build a single narrative from classical sampling dynamics up to modern diffusion samplers, covering error analysis and inference-time techniques along the way. The material is organized in layers with full proofs for the basics, simplified estimates for some results, and roadmaps for the more advanced theorems.

There are no new scientific results here. The paper is explicitly an exposition of existing ideas rather than a source of original theorems or experiments. What it does well is the careful layering and the choice of starting point. Beginning with classical methods and moving to diffusion gives readers a sense of where the ideas come from, which can make the modern versions easier to understand. The target audience of grad students with probability knowledge but no SDE background seems well matched to this approach.

The structure should make it easier to follow than reading scattered research papers. Full proofs for core definitions are the right call for notes like this, and providing proof roadmaps for harder parts is a practical way to point advanced readers forward without overwhelming beginners.

The soft spots are limited. Since the work is not claiming new results, there is no central claim that could be wrong in a load-bearing way. The main question is whether the execution of the proofs and the choice of simplifying assumptions hold up in the full text. If the estimates under simplifying assumptions miss key difficulties that appear in the general case, that could reduce the notes' usefulness, but that is a common feature of introductory material and not a major flaw here.

This paper is for students or researchers who want a self-contained mathematical entry to diffusion models. Someone fitting the background description would probably find it helpful for building foundations. It is not the kind of work that would change how I think about the area or that I would cite for a specific result.

I would not recommend sending it for peer review. The right place for this is as teaching notes rather than a journal article that requires referee evaluation for novelty and impact.

Referee Report

0 major / 2 minor

Summary. The manuscript presents pedagogical notes that provide a proof-oriented introduction to diffusion models from the sampling viewpoint. It traces an arc from classical sampling dynamics through modern diffusion samplers, error analysis, and inference-time control. Material is layered into core definitions and identities with full proofs, representative estimates under simplifying assumptions, and research-level theorems stated with proof roadmaps. The intended audience is beginning graduate students with probability background but no prior exposure to SDEs, stochastic numerics, or diffusion models.

Significance. If the layered structure and proofs are executed accurately, the notes could serve as a valuable pedagogical resource that makes the mathematical foundations of diffusion models more accessible. The emphasis on full proofs for core elements and explicit roadmaps for advanced results addresses a common gap in the literature, where introductions are often either informal or assume advanced prerequisites.

minor comments (2)

The abstract describes three distinct layers of material; ensure each section explicitly indicates which layer it belongs to so that readers can navigate the progression as intended.
Verify that all simplifying assumptions for the representative estimates are stated clearly and that their scope is discussed, to avoid potential confusion for the target audience lacking prior SDE exposure.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and encouraging review. Their recommendation for acceptance is appreciated, and we are pleased that the layered structure and pedagogical focus were viewed favorably.

Circularity Check

0 steps flagged

No circularity: expository notes on prior literature

The document is explicitly positioned as pedagogical notes that trace existing literature on sampling dynamics to diffusion models, proving core definitions and identities in full from standard probability while stating research-level theorems only with proof roadmaps. No novel central claim, derivation, or modeling assertion is advanced whose validity depends on internal reduction to fitted inputs or self-citations. All material is drawn from external sources and presented without self-referential loops.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Because this is a review and exposition paper, the author introduces no new free parameters, axioms, or invented entities; all content draws from prior literature.

pith-pipeline@v0.9.1-grok · 5589 in / 927 out tokens · 15850 ms · 2026-07-03T17:47:30.995990+00:00 · methodology

Mathematics Department, Duke University, Box 90320, Durham, NC 27705 USA. Email address: jianfeng@math.duke.edu
