Why DDIM Hallucinates More Than DDPM: A Theoretical Analysis of Reverse Dynamics

Abhinav N. Harish; Grigorios G. Chrysos; Hung Yun Tseng; Ishaan Kharbanda; Muhammad H. Ashiq; Samanyu Arora

arxiv: 2605.06831 · v2 · pith:QVISCZETnew · submitted 2026-05-07 · 💻 cs.LG · cs.AI

Muhammad H. Ashiq, Samanyu Arora, Abhinav N. Harish, Ishaan Kharbanda, Hung Yun Tseng, Grigorios G. Chrysos

Pith reviewed 2026-06-30 23:09 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords DDPM · DDIM · hallucination · diffusion models · reverse dynamics · Gaussian mixture · stochastic sampling

The pith

DDIM can become stuck on the segment between the two nearest modes of a Gaussian mixture after a critical time τ, while DDPM's stochasticity allows escape and avoids hallucination.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes the reverse processes of DDPM and DDIM for Gaussian mixture targets. It proves that after a critical time, the deterministic DDIM can trap samples on the line connecting two modes, causing hallucinations by producing averaged outputs. In contrast, the stochastic noise in DDPM enables the trajectory to escape this trapping region. Empirical results confirm DDPM has lower hallucination rates, and adding stochastic steps to DDIM mitigates the issue. This provides a theoretical basis for why stochasticity helps in sampling from multimodal distributions.

Core claim

For a Gaussian mixture target distribution, after a critical time τ, the reverse ODE of DDIM can become stuck on the segment connecting the two nearest modes, leading to hallucinated samples that lie between modes, whereas the SDE of DDPM uses stochasticity to become unstuck from this region and avoid hallucination.

What carries the argument

The reverse ODE dynamics of DDIM and SDE dynamics of DDPM applied to a Gaussian mixture, with identification of a critical time τ where deterministic paths stick to inter-mode segments.
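
For orientation, the two reverse dynamics being compared can be written in the standard variance-preserving form used in score-based diffusion. This is a sketch in assumed notation: the noise schedule $\beta(t)$, the marginal $p_t$, and the reverse-time Brownian motion $\bar{W}_t$ are not taken from the paper, whose own parameterization may differ.

```latex
\begin{align}
  \mathrm{d}x_t &= -\tfrac{1}{2}\beta(t)\,x_t\,\mathrm{d}t
                   + \sqrt{\beta(t)}\,\mathrm{d}W_t
     && \text{(forward noising)} \\
  \mathrm{d}x_t &= \Bigl[-\tfrac{1}{2}\beta(t)\,x_t
                   - \beta(t)\,\nabla_x \log p_t(x_t)\Bigr]\mathrm{d}t
                   + \sqrt{\beta(t)}\,\mathrm{d}\bar{W}_t
     && \text{(reverse SDE, DDPM limit)} \\
  \mathrm{d}x_t &= \Bigl[-\tfrac{1}{2}\beta(t)\,x_t
                   - \tfrac{1}{2}\beta(t)\,\nabla_x \log p_t(x_t)\Bigr]\mathrm{d}t
     && \text{(probability-flow ODE, DDIM limit)}
\end{align}
```

With an exact score both reverse processes share the marginals $p_t$; they differ only in the weight on the score term and in the presence of the noise term, which is the ingredient the escape argument relies on.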

If this is right

DDPM exhibits a significantly lower hallucination rate than DDIM when trajectories enter the inter-mode region.
Incorporating additional stochastic steps into DDIM can prevent it from getting stuck and reduce hallucinations.
The analysis offers insights for designing samplers that better handle multimodal distributions by balancing determinism and stochasticity.
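
One way to picture "additional stochastic steps" is the η-parameterized update from the DDIM formulation, where η = 0 gives the deterministic step and η = 1 gives a DDPM-like stochastic step. The sketch below is illustrative only: the 1D two-mode target, the closed-form noise predictor `eps_exact`, the linear beta schedule, and the window in which η is switched on are all assumptions, and the paper's own procedure may differ.

```python
import numpy as np

def make_alpha_bar(T=1000, b0=1e-4, b1=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(b0, b1, T)
    return np.cumprod(1.0 - betas)

def eps_exact(x, ab, mu=1.0, sigma=0.1):
    """Exact noise prediction for the assumed target 0.5*N(mu, sigma^2) + 0.5*N(-mu, sigma^2)."""
    v = ab * sigma**2 + (1.0 - ab)           # variance of each noised component
    m = np.sqrt(ab) * mu                     # mean of the noised positive component
    score = (m * np.tanh(m * x / v) - x) / v
    return -np.sqrt(1.0 - ab) * score        # eps = -sqrt(1 - alpha_bar) * score

def hybrid_sample(n, eta_schedule, alpha_bar, seed=0):
    """Run the eta-parameterized reverse update with a per-step eta (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    for t in range(len(alpha_bar) - 1, -1, -1):
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else 1.0
        eps = eps_exact(x, ab_t)
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        # Per-step noise level: eta = 0 is deterministic, eta = 1 is DDPM-like.
        sig = (eta_schedule[t]
               * np.sqrt((1.0 - ab_prev) / (1.0 - ab_t))
               * np.sqrt(1.0 - ab_t / ab_prev))
        dir_coef = np.sqrt(max(1.0 - ab_prev - sig**2, 0.0))
        x = np.sqrt(ab_prev) * x0_hat + dir_coef * eps + sig * rng.standard_normal(n)
    return x
```

A deterministic run uses `eta_schedule = np.zeros(T)`; setting a slice such as `eta_schedule[200:300] = 1.0` injects noise only in that (arbitrarily chosen) window, and the two outputs can then be compared under any hallucination criterion of interest.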

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The mechanism may apply to other deterministic samplers in diffusion models beyond DDIM.
Hybrid sampling strategies could be developed by switching to stochastic steps near the critical time τ.

Load-bearing premise

The sticking behavior and benefit of stochasticity are proven specifically for Gaussian mixture target distributions.

What would settle it

Running the DDIM reverse process on a two-component Gaussian mixture and checking if samples remain on the inter-mode segment after time τ, versus DDPM samples escaping it.
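
A minimal version of such a check might look like the following. It is a sketch under stated assumptions, not the paper's experiment: it uses a 1D mixture with modes at ±1, the closed-form score instead of a trained network, an assumed linear β schedule, plain Euler and Euler-Maruyama steps, and an ad hoc "midpoint neighborhood" radius. All function names are hypothetical.

```python
import numpy as np

def beta(t, b0=0.1, b1=20.0):
    """Assumed linear noise schedule on t in [0, 1]."""
    return b0 + t * (b1 - b0)

def alpha(t, b0=0.1, b1=20.0):
    """Signal scale exp(-0.5 * integral of beta from 0 to t)."""
    return np.exp(-0.5 * (b0 * t + 0.5 * (b1 - b0) * t**2))

def gmm_score(x, t, mu=1.0, sigma=0.1):
    """Exact score of the noised mixture 0.5*N(mu, sigma^2) + 0.5*N(-mu, sigma^2)."""
    a = alpha(t)
    v = (a * sigma) ** 2 + 1.0 - a**2
    return (a * mu * np.tanh(a * mu * x / v) - x) / v

def sample(n, steps, stochastic, seed=0, t_end=1e-3):
    """Integrate the reverse SDE (stochastic=True) or probability-flow ODE from t=1 to t_end."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    ts = np.linspace(1.0, t_end, steps + 1)
    c = 1.0 if stochastic else 0.5            # weight on the score term
    for t, t_next in zip(ts[:-1], ts[1:]):
        dt = t - t_next
        b = beta(t)
        x = x + (0.5 * b * x + c * b * gmm_score(x, t)) * dt
        if stochastic:
            x = x + np.sqrt(b * dt) * rng.standard_normal(n)
    return x

def midpoint_rate(x, radius=0.5):
    """Fraction of samples within `radius` of the midpoint 0 (an assumed criterion)."""
    return float(np.mean(np.abs(x) < radius))
```

Calling `midpoint_rate(sample(100_000, 500, stochastic=False))` and the same with `stochastic=True` gives the two quantities to compare. Because the score here is exact, any gap reflects discretization rather than the learned-model behavior the paper studies, so this only illustrates the shape of the test.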

Figures

Figures reproduced from arXiv: 2605.06831 by Abhinav N. Harish, Grigorios G. Chrysos, Hung Yun Tseng, Ishaan Kharbanda, Muhammad H. Ashiq, Samanyu Arora.

**Figure 1.** (a) In 100,000 generated samples for a 25-mode Gaussian mixture target, despite using the same pretrained model, DDPM (left) hallucinates significantly less than DDIM (right). (b) Towards the beginning of the reverse process, the trajectory selects a line segment to converge to. After that, the trajectory converges rapidly to the nearest line segment: either the true mode or the midpoint neighborhood. (c) …

**Figure 2.** (1) In black, we have the line segment $L^{(i,j)}_t$ joining two modes. (2) Together with the red portion, this forms $L^{(i,j)}_{t,\varepsilon}$. (3) We then have the $\varepsilon$-ball surrounding modes $i$ and $j$. (4) Next, we have $\mathrm{Tube}^{(i,j)}_{t,\varepsilon}$. (5) We also illustrate the midpoint of the line segment $y^*_t$ (where $w_t = 0$), discussed in Prop. 4.7. This provides a high-level description of key objects used throughout Sec. 4, and is not i…

**Figure 3.** Hallucination rate for varying number of DDIM steps used in the reverse process. Notice that the number of DDIM interpolated samples is consistently larger than that of DDPM. Thus, this invalidates the idea that the gap between DDIM and DDPM hallucination rates arises due to skipping steps. …

**Figure 4.** For both DDIM (Figure 4a) and DDPM (Figure 4b), we plot the convergence rate to the nearest $(i,j)$-mode segment across 100,000 trajectories, finding that convergence occurs after $\tau_1$ and thus validating Theorem 4.2. Note that $i, j$ change across time in these figures; however, as expected, after $\tau_1$ they become fixed. We plot $\varepsilon/\varpi$ as a dotted black line, finding that convergence to $\mathrm{Tube}^{(i,j)}_{t,\varepsilon}$ is after $\tau_2$; thu…

**Figure 5.** Starting DDIM at $\tau_3 = 9$, we find that for $\vartheta = 0.15\,\ell_t$, DDIM gets stuck before it can reach the true modes, i.e., it hallucinates, as predicted by Prop. 4.7. Furthermore, DDPM has a lower hallucination rate within this same $\vartheta$. Thus, we conclude that DDPM noise helps escape the $\vartheta$-neighborhood around the midpoint, as predicted by Prop. 5.1. Given this, we find that adding $z$ DDPM steps after starting DDIM at …
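
The hallucination rates quoted in these captions require some rule for deciding when a sample is "between modes". A simple distance-based rule is sketched below; the rule itself, the `radius` threshold, and the 5×5 unit-spaced grid used for the 25 modes are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def hallucination_rate(samples, modes, radius):
    """Fraction of samples farther than `radius` from every mode (assumed criterion)."""
    # Pairwise distances, shape (num_samples, num_modes).
    dists = np.linalg.norm(samples[:, None, :] - modes[None, :, :], axis=-1)
    return float(np.mean(dists.min(axis=1) > radius))

# Hypothetical 25-mode layout: a 5x5 grid with unit spacing.
grid = np.arange(5, dtype=float)
modes = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)
```

For example, a sample at the midpoint (0.5, 0) of two neighboring modes is counted as hallucinated for any `radius` below 0.5, while a sample lying exactly on a mode never is.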

Original abstract

We theoretically study the hallucination phenomena in two canonical diffusion samplers: the stochastic Denoising Diffusion Probabilistic Model (DDPM) and the deterministic Denoising Diffusion Implicit Model (DDIM). We analyze the reverse ODE (DDIM) and SDE (DDPM) for a Gaussian mixture target, proving that after a critical time $\tau$, (a) DDIM can become stuck on the segment connecting the two nearest modes and (b) DDPM *stochasticity* helps it become unstuck from this region, thus avoiding hallucination. Our empirical validation verifies that DDPM has a significantly lower hallucination rate than DDIM when this region is entered. Building on our observations, we exhibit how using additional stochastic steps can help DDIM avoid hallucinations and offer new insights on how to design improved samplers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper proves that the DDIM reverse ODE can be trapped on the segment connecting the two nearest modes of a Gaussian mixture after time τ, while DDPM escapes via noise, but the analysis stays inside the Gaussian mixture setting.

The central new piece is the explicit derivation showing that after the critical time τ the DDIM ODE on a Gaussian mixture can lock onto the line segment between the two nearest modes, while the noise term of the DDPM SDE lets trajectories leave that region. The authors verify the hallucination-rate difference empirically when the trajectory enters the segment and note that extra stochastic steps can mitigate it for DDIM.

The analysis is straightforward: it works directly from the reverse ODE and SDE on the stated target without fitted parameters or self-reference. That makes the contrast between deterministic and stochastic dynamics concrete for this case.

The obvious limit is scope. Everything is derived and tested only for Gaussian mixture targets (the experiment in Figure 1 uses a 25-mode mixture). No argument shows the same local trapping is rate-limiting once the target has non-Gaussian components or manifold support, and the empirical check is conditioned on already being in the bad region rather than measuring overall hallucination rates on realistic data.

This is useful for people who build or analyze diffusion samplers and want a mechanistic handle on deterministic versus stochastic behavior. The math is worth a referee's time to verify the derivation steps and τ definition, even if the practical payoff for complex models is still open.

Referee Report

2 major / 1 minor

Summary. The paper claims to theoretically analyze the reverse ODE (DDIM) and SDE (DDPM) dynamics on a Gaussian mixture target distribution. It proves that after a critical time τ, DDIM can become stuck on the line segment connecting the two nearest modes (causing hallucinations), while DDPM stochasticity allows escape from this region. Empirical validation shows DDPM has lower hallucination rates when this region is entered, and the work suggests adding stochastic steps to DDIM to mitigate hallucinations.

Significance. If the central claim holds, the work supplies a mechanistic explanation for differences in hallucination behavior between deterministic and stochastic diffusion samplers, which could inform improved sampler design. The exact analysis on a Gaussian mixture target is a strength, enabling a rigorous proof of the sticking phenomenon and the benefit of stochasticity; the empirical verification conditioned on the region is also a positive element.

major comments (2)

[theoretical analysis (reverse dynamics)] The proof of the sticking behavior after critical time τ (abstract) lacks the derivation steps, the explicit formula for τ, and the specific Gaussian mixture parameters, so the support for the central claim cannot be evaluated.
[empirical validation] The empirical validation verifies lower hallucination rates for DDPM only when conditioned on entering the inter-mode region; it does not demonstrate that segment-sticking is the dominant mechanism on high-dimensional or multi-modal data beyond the two-component Gaussian mixture.

minor comments (1)

The notation and definition of the critical time τ should be stated explicitly with an equation, even if the full derivation is in an appendix.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and constructive comments. We address each major comment below.

Referee: [theoretical analysis (reverse dynamics)] The proof of the sticking behavior after critical time τ (abstract) lacks the derivation steps, the explicit formula for τ, and the specific Gaussian mixture parameters, so the support for the central claim cannot be evaluated.

Authors: We agree that the presentation of the proof requires greater explicitness for full evaluability. The derivation of the sticking behavior for the DDIM reverse ODE is contained in Section 3, with supporting steps in Appendix A; however, we will revise the manuscript to prominently display the explicit formula for the critical time τ (the time at which the velocity field aligns with the inter-mode segment) and to state the precise Gaussian mixture parameters (two components with means at ±e₁ and isotropic covariance σ²I).
revision: yes

Referee: [empirical validation] The empirical validation verifies lower hallucination rates for DDPM only when conditioned on entering the inter-mode region; it does not demonstrate that segment-sticking is the dominant mechanism on high-dimensional or multi-modal data beyond the two-component Gaussian mixture.

Authors: The empirical section is deliberately conditioned on entry into the inter-mode region precisely to isolate and verify the mechanism predicted by the theory. The paper's scope is the rigorous analysis of this phenomenon on the two-component Gaussian mixture, which permits an exact proof; we make no claim that segment-sticking is the dominant mechanism in high-dimensional or more complex multi-modal settings. The identified mechanism nevertheless supplies design insight, as illustrated by the stochastic-step augmentation we propose. No revision is required.
revision: no
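
For the mixture parameters named in the first response (two equally weighted components with means ±e₁ and covariance σ²I), the noised marginal and its score have a closed form under a variance-preserving forward process. The worked expression below is a sketch: the signal scale $\alpha_t$, the equal weights, and the forward process are assumptions, and the parameters themselves come from the simulated rebuttal rather than from the paper.

```latex
\begin{align}
  v_t &= \alpha_t^2 \sigma^2 + 1 - \alpha_t^2, \\
  p_t(x) &= \tfrac{1}{2}\,\mathcal{N}\bigl(x;\ \alpha_t e_1,\ v_t I\bigr)
          + \tfrac{1}{2}\,\mathcal{N}\bigl(x;\ -\alpha_t e_1,\ v_t I\bigr), \\
  \log p_t(x) &= \mathrm{const} - \frac{\lVert x \rVert^2}{2 v_t}
          + \log\cosh\!\Bigl(\frac{\alpha_t\, x_1}{v_t}\Bigr), \\
  \nabla_x \log p_t(x) &= \frac{1}{v_t}\Bigl[\alpha_t
          \tanh\!\Bigl(\frac{\alpha_t\, x_1}{v_t}\Bigr) e_1 - x\Bigr].
\end{align}
```

Here $x_1 = \langle x, e_1 \rangle$. On the hyperplane $x_1 = 0$ the tanh term vanishes, so the score points straight back toward the origin, which is the midpoint of the two modes; this is the geometric setting in which a deterministic trajectory has no component pushing it toward either mode.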

Circularity Check

0 steps flagged

Direct analysis of reverse ODE/SDE for Gaussian mixture; self-contained derivation

The paper states it analyzes the reverse ODE (DDIM) and SDE (DDPM) directly for a Gaussian mixture target, proving sticking behavior after time τ and the role of stochasticity. No steps reduce by construction to fitted inputs, self-citations, or renamed empirical patterns; the proof is presented as following from the stated target distribution and diffusion equations. Empirical checks are described as verification of the derived mechanism rather than its foundation. This is a standard non-circular theoretical derivation for the toy setting.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on the choice of a Gaussian mixture as the target distribution and on the standard formulation of the DDIM ODE and DDPM SDE reverse processes; no free parameters or new entities are introduced in the abstract.

axioms (2)

domain assumption: The data distribution is a Gaussian mixture. The proof of sticking and escape is performed explicitly for this target class as stated in the abstract.
standard math: The reverse process is the standard probability-flow ODE for DDIM and the corresponding SDE for DDPM. The analysis invokes the canonical reverse dynamics of each sampler without additional derivation.


