Tweedie's Formulae and Diffusion Generative Models Beyond Gaussian

Nizar Touzi; Wenpin Tang; Xun Yu Zhou; Zikun Zhang

arxiv: 2605.19391 · v1 · pith:LSOIWBNFnew · submitted 2026-05-19 · 📊 stat.ML · cs.LG

Tweedie's Formulae and Diffusion Generative Models Beyond Gaussian

Wenpin Tang , Nizar Touzi , Zikun Zhang , Xun Yu Zhou This is my paper

Pith reviewed 2026-05-20 03:15 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords Tweedie's formuladiffusion modelsnon-Gaussian processesgeometric Brownian motionCox-Ingersoll-Rosssquared Bessel processdenoising score matchinggenerative models

0 comments

The pith

Tweedie's formula extends to geometric Brownian motion, squared Bessel, and Cox-Ingersoll-Ross processes, enabling denoising score matching for non-Gaussian diffusion models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends Tweedie's formula, which relates the score function to conditional expectations during denoising, from the usual Gaussian setting to three non-Gaussian diffusions. Explicit formulae are derived for geometric Brownian motion, squared Bessel processes, and Cox-Ingersoll-Ross processes, each producing a concrete denoising score-matching objective. These objectives are applied to train generative models on images and financial time series and to empirical Bayes estimation under the squared Bessel setting. A sympathetic reader would care because current diffusion models almost all rely on additive Gaussian noise; removing that restriction could let models respect domain-specific constraints such as positivity or volatility clustering.

Core claim

We extend Tweedie's formula to the geometric Brownian motion, squared Bessel, and Cox-Ingersoll-Ross processes. The resulting identities express the score function of the perturbed data in terms of the conditional expectation of the clean data under the respective process law, thereby supplying explicit denoising score-matching losses that can be minimized to learn the reverse diffusion.

What carries the argument

Extended Tweedie's formulae for GBM, BESQ and CIR that give the score as the gradient of the log-transition density expressed via conditional expectations under each process.

If this is right

GBM-based diffusion models become trainable for image generation via the corresponding score-matching loss.
CIR-based models can be trained for financial time-series generation.
BESQ processes admit empirical Bayes estimation through the derived formula.
Diffusion models with state-dependent diffusion coefficients become practical alternatives to Gaussian ones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Models built on these processes may automatically respect positivity constraints common in prices or intensities without post-processing.
The same derivation route could be applied to other diffusions whose transition densities or conditional expectations are known in closed form.
Empirical comparisons on data with strong mean-reversion or multiplicative noise would test whether the non-Gaussian choice improves sample quality over standard Gaussian diffusion.

Load-bearing premise

The derived formulae for GBM, BESQ and CIR produce denoising score-matching objectives that can be successfully optimized and yield useful generative performance.

What would settle it

Training a GBM- or CIR-based diffusion model on a known target distribution using the derived score-matching objective and finding that the generated samples systematically fail to match the target statistics would falsify the claim that the extension supplies workable objectives.

Figures

Figures reproduced from arXiv: 2605.19391 by Nizar Touzi, Wenpin Tang, Xun Yu Zhou, Zikun Zhang.

**Figure 1.** Figure 1: Real and Generated MNIST Images (a) Preprocessed Dataset (b) GBM-Based Samples (c) VE-Based Samples (d) CIR-Based Samples (e) VP-Based Samples [PITH_FULL_IMAGE:figures/full_fig_p016_1.png] view at source ↗

**Figure 2.** Figure 2: Real and Generated 64-Day Sum Log-Returns of Three Portfolios (a) GBM-Based Samples (b) VP-Based Samples in terms of mean and volatility than the VP-based model, particularly for the Equal-Weight and Risk-Parity portfolios. Both models exhibit a modest mismatch with the real data in the tail regions, which is largely attributable to the high sensitivity of financial time series to numerical errors inherent… view at source ↗

**Figure 3.** Figure 3: Gamma Example (a) Histogram of 5000 zi ’s (b) Natural Spline Fit with 10 Degrees of Freedom We randomly sample N = 5000 values of ui from Gamma distribution Γ(12, 10). Figure 3a shows the frequency of 5000 generated zi ’s, where there are 63 bins. Denote the center of the k-th bin by xk and the corresponding bar height by yk for k = 1, . . . , 63. Figure 3b shows log yk against xk, with bins satisfying yk… view at source ↗

**Figure 4.** Figure 4: Empirical Bayes Estimation Curves for BESQ effectiveness of the resulting denoising score-matching methods. We also apply Tweedie’s formula under the BESQ framework to estimate the noncentrality parameter from the noncentral chi-squared noise for empirical Bayes estimation. To sum, we extend original Tweedie’s formula to non-Gaussian processes and showcase the promise of non-Gaussian diffusion models for… view at source ↗

**Figure 5.** Figure 5: Exponential Example (a) Histogram of 5000 zi ’s (b) Natural Spline Fit with 10 Degrees of Freedom (σ = 0.1) Following the experimental setting in [15], we set N = 5000 ui values as 10 repetitions each of ui = log log 500 i − 0.5 , i = 1, · · · , 500. The empirical distribution of e ui closely matches an exponential distribution with rate 1. Then zi is generated via LogNormal(ui , σ2 ) for each i. Here, we… view at source ↗

**Figure 6.** Figure 6: Empirical Bayes Estimation Curves (σ = 0.1) (a) GBM model in z-space (b) BM model in (log z)-space [PITH_FULL_IMAGE:figures/full_fig_p022_6.png] view at source ↗

**Figure 7.** Figure 7: Empirical Bayes Estimation Curves (σ = 0.5) (a) GBM model in z-space (b) BM model in (log z)-space [PITH_FULL_IMAGE:figures/full_fig_p023_7.png] view at source ↗

**Figure 8.** Figure 8: Empirical Bayes Estimation Curves (σ = 1.0) (a) GBM model in z-space (b) BM model in (log z)-space We also consider directly applying Tweedie’s formula in the (log z)-space to estimate ui . Let ˜zi = log zi for all i ∈ [N]. Then, equivalently, we observe: z˜i ∼ N (ui , σ2 ), i = 1, · · · , N, and aim to estimate ui . Therefore, we can apply Tweedie’s formula for the Gaussian case in the ˜z-space to obtain … view at source ↗

read the original abstract

Diffusion models have achieved remarkable success in generating samples from unknown data distributions. Most popular stochastic differential equation-based diffusion models perturb the target distribution by adding Gaussian noise, transforming it into a simple prior, and then use denoising score matching, a consequence of Tweedie's formula, to learn the score function and generate clean samples from noise. However, non-Gaussian diffusion models with state-dependent diffusion coefficient have been largely underexplored, as have the corresponding Tweedie's formulae. In this work, we extend Tweedie's formula to important non-Gaussian processes, including geometric Brownian motion (GBM), squared Bessel (BESQ) processes, and Cox-Ingersoll-Ross (CIR) processes, thereby yielding the corresponding denoising score-matching objectives. We then apply the derived formulae to image and financial time series generation using GBM- and CIR-based diffusion models, and to empirical Bayes estimation under the BESQ setting. The reported experimental results demonstrate the potential of non-Gaussian models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper derives Tweedie's formulae for GBM, BESQ, and CIR to set up non-Gaussian denoising objectives and tests them on generation and estimation tasks.

read the letter

The main point is that they extend Tweedie's formula to geometric Brownian motion, squared Bessel processes, and CIR, then use the results to write denoising score-matching losses for diffusion models whose noise has state-dependent volatility. They apply this to image generation with GBM, financial time-series with CIR, and empirical Bayes estimation under BESQ, and the experiments indicate these models can produce reasonable samples. That is the concrete advance over the usual Gaussian setup. The derivations appear to come directly from the infinitesimal generators and known transition densities of each process, which is the right way to do it. The applications to finance and time series are a natural fit and give the work some practical grounding. The soft spot is whether the new formulae reduce cleanly to the classical Gaussian Tweedie identity when the diffusion coefficient is made constant; if that check is missing or the boundary terms are mishandled, the objectives could be incorrect for the target SDEs. The experiments would also benefit from tighter controls showing that the non-Gaussian choice actually improves performance rather than just matching it. This is aimed at researchers who already work with diffusion models or stochastic processes and want to move beyond additive Gaussian noise. It is solid enough to deserve a serious referee, even though the derivations will need close scrutiny in review.

Referee Report

2 major / 2 minor

Summary. The paper extends Tweedie's formula to non-Gaussian diffusion processes including geometric Brownian motion (GBM), squared Bessel (BESQ) processes, and Cox-Ingersoll-Ross (CIR) processes. These extensions produce corresponding denoising score-matching objectives. The authors apply the resulting objectives to train GBM- and CIR-based diffusion models for image generation and financial time-series generation, and to empirical Bayes estimation in the BESQ setting. Experimental results are reported to illustrate the potential of such non-Gaussian models.

Significance. If the derivations hold, the work is significant because it supplies explicit Tweedie-type identities and score-matching losses for processes whose diffusion coefficients depend on state, which are natural in finance and other domains. The manuscript provides closed-form expressions that generalize the Gaussian case and directly yield trainable objectives, together with reproducible experiments on both images and time series. This combination of derivation and application strengthens the case for exploring non-Gaussian diffusions.

major comments (2)

[§3.2] §3.2, Eq. (12) (GBM Tweedie identity): the derivation does not explicitly verify reduction to the classical Gaussian Tweedie formula when the volatility parameter is taken to zero while keeping the drift fixed; without this limit check the generalization to state-dependent diffusion remains unconfirmed.
[§4.1] §4.1, the infinitesimal-generator step for CIR: the boundary behavior at zero for the CIR process is not addressed when relating the conditional expectation to the score term; this is load-bearing because the generator contains a state-dependent term that vanishes at the boundary.

minor comments (2)

Notation for the score function is introduced inconsistently between the GBM and BESQ sections; a single definition table would improve readability.
Figure 3 caption does not state the number of independent runs or the error bars shown; this affects interpretation of the reported FID and likelihood values.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive summary and recommendation for major revision. We address the two major comments point by point below, agreeing to incorporate clarifications and verifications in the revised manuscript.

read point-by-point responses

Referee: [§3.2] §3.2, Eq. (12) (GBM Tweedie identity): the derivation does not explicitly verify reduction to the classical Gaussian Tweedie formula when the volatility parameter is taken to zero while keeping the drift fixed; without this limit check the generalization to state-dependent diffusion remains unconfirmed.

Authors: We agree that an explicit verification of the limit would strengthen the presentation. In the revised manuscript, we will add a paragraph in Section 3.2 demonstrating that as the volatility parameter σ approaches 0 with the drift fixed, the GBM Tweedie identity in Eq. (12) reduces to the classical Gaussian Tweedie's formula. This limit check confirms the consistency of our generalization. revision: yes
Referee: [§4.1] §4.1, the infinitesimal-generator step for CIR: the boundary behavior at zero for the CIR process is not addressed when relating the conditional expectation to the score term; this is load-bearing because the generator contains a state-dependent term that vanishes at the boundary.

Authors: We appreciate this observation on the boundary behavior. The CIR process under the Feller condition (2κθ > σ²) does not reach the zero boundary with probability one, allowing the infinitesimal generator to be applied in the interior. We will revise Section 4.1 to explicitly mention this assumption and clarify that the relation between the conditional expectation and the score term holds away from the boundary. A note on the boundary conditions will be added for completeness. revision: yes

Circularity Check

0 steps flagged

Derivations of Tweedie's formulae for GBM/BESQ/CIR are independent mathematical extensions with no reduction to inputs by construction.

full rationale

The paper presents explicit derivations of Tweedie's formulae for the listed non-Gaussian processes by applying the Markov property and known transition densities to the infinitesimal generators of GBM, BESQ, and CIR SDEs. These steps produce denoising score-matching objectives as direct consequences of the conditional expectations, without any fitted parameters being relabeled as predictions or any self-referential definitions. No load-bearing claim reduces to a self-citation chain; the central results stand on the process definitions and standard stochastic calculus identities. Experiments then optimize the resulting objectives on image and time-series data, confirming the derivations are self-contained against external benchmarks rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described. The work appears to rely on standard stochastic process theory for GBM, BESQ, and CIR without introducing new entities.

pith-pipeline@v0.9.0 · 5701 in / 1161 out tokens · 64783 ms · 2026-05-20T03:15:42.760948+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

σ²(t,x)∇log p(t,x) + 2σ(t,x)∂xσ(t,x) = b(t,x) + lim ε→0 (1/ε)E(X_{t-ε}-X_t | X_t=x) (Prop. 2.3); Tweedie formulae for GBM (3.4), BESQ (3.10), CIR (3.13)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages · 7 internal anchors

[1]

Aghapour, E

A. Aghapour, E. Bayraktar, and F. Yuan. Solving dynamic portfolio selection problems via score-based diffusion models. 2025. arXiv:2507.09916

work page arXiv 2025
[2]

B. D. O. Anderson. Reverse-time diffusion equation models.Stochastic Process. Appl., 12(3):313–326, 1982

work page 1982
[3]

Avdeyev, C

P. Avdeyev, C. Shi, Y. Tan, K. Dudnyk, and J. Zhou. Dirichlet diffusion score model for biological sequence generation. InICML, pages 1276–1301, 2023

work page 2023
[4]

Benton, V

J. Benton, V. D. Bortoli, A. Doucet, and G. Deligiannidis. Nearlyd-linear convergence bounds for diffusion models via stochastic localization. InICLR, 2024

work page 2024
[5]

Flux.2: Frontier visual intelligence.https://bfl.ai/blog/flux-2, 2025

Black Forest Labs. Flux.2: Frontier visual intelligence.https://bfl.ai/blog/flux-2, 2025

work page 2025
[6]

H. Chen, H. Lee, and J. Lu. Improved analysis of score-based generative modeling: User-friendly bounds under minimal smoothness assumptions. InICML, pages 4735–4763, 2023

work page 2023
[7]

S. Chen, S. Chewi, J. Li, Y. Li, A. Salim, and A. Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. InICLR, 2023

work page 2023
[8]

J. C. Cox. The constant elasticity of variance option pricing model.J. Portf. Manag., page 15, 1996

work page 1996
[9]

J. C. Cox, J. E. Ingersoll, and S. A. Ross. A theory of the term structure of interest rates.Econometrica, 53(2):385–407, 1985

work page 1985
[10]

D. Dai, J. Fan, Y. Gu, and D. Mukherjee. Cindes: Classification induced neural density estimator and simulator. 2025. arXiv:2510.00367

work page arXiv 2025
[11]

Delbaen and H

F. Delbaen and H. Shirakawa. A note on option pricing for the constant elasticity of variance model. Asia-Pac. Financ. Mark., 9:85–99, 2002. [12]Digital Library of Mathematical Functions.http://dlmf.nist.gov/, Release 1.2.4 of 2025-03-15

work page 2002
[12]

L. E. Dubins and G. Schwarz. On continuous martingales.Proc. Natl. Acad. Sci., 53(5):913–916, 1965. 25

work page 1965
[13]

B. Efron. Microarrays, empirical Bayes and the two-groups model.Stat. Sci., 2008

work page 2008
[14]

B. Efron. Tweedie’s formula and selection bias.J. Amer. Statist. Assoc., 106(496):1602–1614, 2011

work page 2011
[15]

Efron.Large-scale inference: empirical Bayes methods for estimation, testing, and prediction

B. Efron.Large-scale inference: empirical Bayes methods for estimation, testing, and prediction. Cam- bridge University Press, 2012

work page 2012
[16]

Efron and N

B. Efron and N. R. Zhang. False discovery rates and copy number variation.Biometrika, 98(2):251–271, 2011

work page 2011
[17]

D. C. Emanuel and J. D. MacBeth. Further results on the constant elasticity of variance call option pricing model.J. Financ. Quant. Anal., 17(4):533–554, 1982

work page 1982
[18]

J. Fan, Y. Gu, and X. Li. Optimal estimation of a factorizable density using diffusion models with relu neural networks. 2025. arXiv:2510.03994

work page arXiv 2025
[19]

Fitzsimmons, J

P. Fitzsimmons, J. Pitman, and M. Yor. Markovian bridges: construction, Palm interpretation, and splicing. InSeminar on Stochastic Processes, 1992 (Seattle, WA, 1992), volume 33 ofProgr. Probab., pages 101–134. Birkh¨ auser Boston, Boston, MA, 1993

work page 1992
[20]

Floto, T

G. Floto, T. Jonsson, M. Nica, S. Sanner, and E. Z. Zhu. Diffusion on the probability simplex. 2023. arXiv:2309.02530

work page arXiv 2023
[21]

X. Gao, J. Zha, and X. Y. Zhou. Data-driven generative simulation of SDEs using diffusion models. 2025. arXiv:2509.08731

work page arXiv 2025
[22]

Y. Gao, H. Guo, T. Hoang, W. Huang, L. Jiang, F. Kong, H. Li, J. Li, L. Li, and X. Li. Seedance 1.0: Exploring the boundaries of video generation models. 2025. arXiv:2506.09113

work page internal anchor Pith review Pith/arXiv arXiv 2025
[23]

State-of-the-art video and image generation with Veo 2 and Imagen 3.https://blog.google/ technology/google-labs/video-image-generation-update-december-2024/, 2024

Google. State-of-the-art video and image generation with Veo 2 and Imagen 3.https://blog.google/ technology/google-labs/video-image-generation-update-december-2024/, 2024

work page 2024
[24]

I. S. Gradshteyn and I. M. Ryzhik.Table of integrals, series, and products. Elsevier/Academic Press, Amsterdam, eighth edition, 2015

work page 2015
[25]

Gu and R

J. Gu and R. Koenker. Unobserved heterogeneity in income dynamics: An empirical Bayes perspective. J. Bus. Econom. Statist., 35(1):1–16, 2017

work page 2017
[26]

Z. Guo, J. Li, W. Tang, and D. D. Yao. Diffusion generative models meet compressed sensing, with applications to imaging and finance. 2025. arXiv:2509.03898

work page arXiv 2025
[27]

Z. Guo, W. Tang, and R. Xu. Conditional diffusion guidance under hard constraint: a stochastic analysis approach. 2026. arXiv:2602.05533

work page arXiv 2026
[28]

U. G. Haussmann and E. Pardoux. Time reversal of diffusions.Ann. Probab., 14(4):1188–1205, 1986

work page 1986
[29]

Heusel, H

M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local nash equilibrium. InNeurips, volume 30, pages 6629–6640, 2017

work page 2017
[30]

J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. InNeurips, volume 33, pages 6840–6851, 2020

work page 2020
[31]

Hyv¨ arinen

A. Hyv¨ arinen. Estimation of non-normalized statistical models by score matching.J. Mach. Learn. Res., 6:695–709, 2005

work page 2005
[32]

Ignatiadis and B

N. Ignatiadis and B. Sen.Empirical Bayes: From Herbert Robbins to modern theory and appli- cations. 2025. Lecture notes available athttps://nignatiadis.github.io/assets/lecture_notes/ Empirical-Bayes.pdf

work page 2025
[33]

Jeanblanc, M

M. Jeanblanc, M. Yor, and M. Chesney.Mathematical methods for financial markets. Springer Finance. Springer-Verlag London, Ltd., London, 2009

work page 2009
[34]

Karatzas and S

I. Karatzas and S. E. Shreve.Brownian motion and stochastic calculus, volume 113 ofGraduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991

work page 1991
[35]

Karras, M

T. Karras, M. Aittala, T. Aila, and S. Laine. Elucidating the design space of diffusion-based generative models. InNeurips, volume 35, pages 26565–26577, 2022

work page 2022
[36]

Kawazu and S

K. Kawazu and S. Watanabe. Branching processes with immigration and related limit theorems.Teor. Verojatnost. i Primenen., 16:34–51, 1971

work page 1971
[37]

Mercury: Ultra-Fast Language Models Based on Diffusion

S. Khanna, S. Kharbanda, S. Li, H. Varma, E. Wang, S. Birnbaum, Z. Luo, Y. Miraoui, A. Palrecha, and S. Ermon. Mercury: Ultra-fast language models based on diffusion. 2025. arXiv:2506.17298

work page internal anchor Pith review Pith/arXiv arXiv 2025
[38]

Kim, S.-Y

G. Kim, S.-Y. Choi, and Y. Kim. A diffusion-based generative model for financial time series via geometric Brownian motion. 2025. arXiv:2507.19003

work page arXiv 2025
[39]

Lamperti

J. Lamperti. Continuous state branching processes.Bull. Amer. Math. Soc., 73:382–386, 1967

work page 1967
[40]

H. Lee, J. Lu, and Y. Tan. Convergence for score-based generative modeling with polynomial complexity. InNeurips, volume 35, pages 22870–22882, 2022. 26 WENPIN TANG, NIZAR TOUZI, ZIKUN ZHANG, AND XUN YU ZHOU

work page 2022
[41]

G. Li, Y. Wei, Y. Chen, and Y. Chi. Towards faster non-asymptotic convergence for diffusion-based generative models. InICLR, 2024

work page 2024
[42]

A. Liu, M. He, S. Zeng, S. Zhang, L. Zhang, C. Wu, W. Jia, Y. Liu, X. Zhou, and J. Zhou. WeDLM: Recon- ciling diffusion language models with standard causal attention for fast inference. 2025. arXiv:2512.22737

work page arXiv 2025
[43]

H. Liu, T. Zhu, N. Jia, J. He, and Z. Zheng. Learning to simulate from heavy-tailed distribution via diffusion model. 2024. SSRN 4975931

work page 2024
[44]

Masreliez

C. Masreliez. Approximate non-gaussian filtering with linear state and observation relations.IEEE Trans. Autom. Control, 20(1):107–110, 1975

work page 1975
[45]

Miyasawa

K. Miyasawa. An empirical Bayes estimator of the mean of a normal population.Bull. Inst. Internat. Statist, 38(181-188):1–2, 1961

work page 1961
[46]

S. Nie, F. Zhu, Z. You, X. Zhang, J. Ou, J. Hu, J. Zhou, Y. Lin, J.-R. Wen, and C. Li. Large language diffusion models. 2025. arXiv:2502.09992

work page internal anchor Pith review Pith/arXiv arXiv 2025
[47]

K. Oko, S. Akiyama, and T. Suzuki. Diffusion models are minimax optimal distribution estimators. In ICML, pages 26517–26582, 2023

work page 2023
[48]

Sora: Creating video from text.https://openai.com/sora, 2024

OpenAI. Sora: Creating video from text.https://openai.com/sora, 2024

work page 2024
[49]

L. R. Pericchi and A. F. M. Smith. Exact and approximate posterior moments for a normal location parameter.J. Roy. Statist. Soc. Ser. B, 54(3):793–804, 1992

work page 1992
[50]

Pitman and M

J. Pitman and M. Yor. A decomposition of Bessel bridges.Z. Wahrsch. Verw. Gebiete, 59(4):425–457, 1982

work page 1982
[51]

N. G. Polson. A representation of the posterior mean for a location model.Biometrika, 78(2):426–430, 1991

work page 1991
[52]

Hierarchical Text-Conditional Image Generation with CLIP Latents

A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen. Hierarchical text-conditional image generation with clip latents. arXiv:2204.06125

work page internal anchor Pith review Pith/arXiv arXiv
[53]

Revuz and M

D. Revuz and M. Yor.Continuous martingales and Brownian motion, volume 293 ofGrundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin, third edition, 1999

work page 1999
[54]

P. H. Richemond, S. Dieleman, and A. Doucet. Categorical SDEs with simplex diffusion. 2022. arXiv:2210.14784

work page arXiv 2022
[55]

H. Robbins. An empirical Bayes approach to statistics. InProceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, vol. I, pages 157–163. Univ. California Press, Berkeley-Los Angeles, Calif., 1956

work page 1954
[56]

L. Rogers. Which model for term-structure of interest rates should one use?IMA, 65:93, 1995

work page 1995
[57]

Rombach, A

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. InCVPR, pages 10684–10695, 2022

work page 2022
[58]

Ronneberger, P

O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmen- tation. InMICCAI, pages 234–241, 2015

work page 2015
[59]

Saremi and A

S. Saremi and A. Hyv¨ arinen. Neural empirical Bayes.J. Mach. Learn. Res., 20:Paper No. 181, 23, 2019

work page 2019
[60]

Shetty, M

N. Shetty, M. Prasath, and C. S. Seelamantula. Dale meets langevin: A multiplicative denoising diffusion model. 2025. arXiv:2510.02730

work page internal anchor Pith review arXiv 2025
[61]

J. Shi, J. Feng, and W. Song. Estimation in linear regression with laplace measurement error using tweedie-type formula.J. Syst. Sci. Complex., 32(4):1211–1230, 2019

work page 2019
[62]

Shirakawa

H. Shirakawa. Squared Bessel processes and their applications to the square root interest rate model. Asia-Pac. Financ. Mark., 9(3):169–190, 2002

work page 2002
[63]

Singer, A

U. Singer, A. Polyak, T. Hayes, X. Yin, J. An, S. Zhang, Q. Hu, H. Yang, O. Ashual, and O. Gafni. Make-a-video: Text-to-video generation without text-video data. InICLR, 2023

work page 2023
[64]

Song and S

Y. Song and S. Ermon. Generative modeling by estimating gradients of the data distribution. InNeurips, volume 32, page 11918–11930, 2019

work page 2019
[65]

Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. InICLR, 2021

work page 2021
[66]

C. J. Stone. Optimal rates of convergence for nonparametric estimators.Ann. Statist., 8(6):1348–1360, 1980

work page 1980
[67]

C. J. Stone. Optimal global rates of convergence for nonparametric regression.Ann. Statist., 10(4):1040– 1053, 1982

work page 1982
[68]

D. W. Stroock and S. R. S. Varadhan.Multidimensional diffusion processes, volume 233 ofGrundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1979. 27

work page 1979
[69]

Tang and H

W. Tang and H. Zhao. Score-based diffusion models via stochastic differential equations.Statistic Surveys, 19:28–64, 2025

work page 2025
[70]

Tang and H

W. Tang and H. Zhao. Contractive diffusion probabilistic models. 2026. To appear in SIAM J. Imaging Sci

work page 2026
[71]

S. Torres. Tweedie calculus. 2026. arXiv:2604.14486

work page internal anchor Pith review Pith/arXiv arXiv 2026
[72]

P. Vincent. A connection between score matching and denoising autoencoders.Neural Comput., 23(7):1661–1674, 2011

work page 2011
[73]

J. Ye, Z. Xie, L. Zheng, J. Gao, Z. Wu, X. Jiang, Z. Li, and L. Kong. Dream 7b: Diffusion large language models. 2025. arXiv:2508.15487

work page internal anchor Pith review Pith/arXiv arXiv 2025
[74]

Z. Zhao, C. Yeh, L. Kong, and K. Wang. Diffusion-DFL: decision-focused diffusion models for stochastic optimization. InICLR, 2026. Department of Industrial Engineering and Operations Research, Columbia University. Email address:wt2319@columbia.edu Department of Finance and Risk Engineering, New York University. Email address:nt2635@nyu.edu Department of I...

work page 2026

[1] [1]

Aghapour, E

A. Aghapour, E. Bayraktar, and F. Yuan. Solving dynamic portfolio selection problems via score-based diffusion models. 2025. arXiv:2507.09916

work page arXiv 2025

[2] [2]

B. D. O. Anderson. Reverse-time diffusion equation models.Stochastic Process. Appl., 12(3):313–326, 1982

work page 1982

[3] [3]

Avdeyev, C

P. Avdeyev, C. Shi, Y. Tan, K. Dudnyk, and J. Zhou. Dirichlet diffusion score model for biological sequence generation. InICML, pages 1276–1301, 2023

work page 2023

[4] [4]

Benton, V

J. Benton, V. D. Bortoli, A. Doucet, and G. Deligiannidis. Nearlyd-linear convergence bounds for diffusion models via stochastic localization. InICLR, 2024

work page 2024

[5] [5]

Flux.2: Frontier visual intelligence.https://bfl.ai/blog/flux-2, 2025

Black Forest Labs. Flux.2: Frontier visual intelligence.https://bfl.ai/blog/flux-2, 2025

work page 2025

[6] [6]

H. Chen, H. Lee, and J. Lu. Improved analysis of score-based generative modeling: User-friendly bounds under minimal smoothness assumptions. InICML, pages 4735–4763, 2023

work page 2023

[7] [7]

S. Chen, S. Chewi, J. Li, Y. Li, A. Salim, and A. Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. InICLR, 2023

work page 2023

[8] [8]

J. C. Cox. The constant elasticity of variance option pricing model.J. Portf. Manag., page 15, 1996

work page 1996

[9] [9]

J. C. Cox, J. E. Ingersoll, and S. A. Ross. A theory of the term structure of interest rates.Econometrica, 53(2):385–407, 1985

work page 1985

[10] [10]

D. Dai, J. Fan, Y. Gu, and D. Mukherjee. Cindes: Classification induced neural density estimator and simulator. 2025. arXiv:2510.00367

work page arXiv 2025

[11] [11]

Delbaen and H

F. Delbaen and H. Shirakawa. A note on option pricing for the constant elasticity of variance model. Asia-Pac. Financ. Mark., 9:85–99, 2002. [12]Digital Library of Mathematical Functions.http://dlmf.nist.gov/, Release 1.2.4 of 2025-03-15

work page 2002

[12] [12]

L. E. Dubins and G. Schwarz. On continuous martingales.Proc. Natl. Acad. Sci., 53(5):913–916, 1965. 25

work page 1965

[13] [13]

B. Efron. Microarrays, empirical Bayes and the two-groups model.Stat. Sci., 2008

work page 2008

[14] [14]

B. Efron. Tweedie’s formula and selection bias.J. Amer. Statist. Assoc., 106(496):1602–1614, 2011

work page 2011

[15] [15]

Efron.Large-scale inference: empirical Bayes methods for estimation, testing, and prediction

B. Efron.Large-scale inference: empirical Bayes methods for estimation, testing, and prediction. Cam- bridge University Press, 2012

work page 2012

[16] [16]

Efron and N

B. Efron and N. R. Zhang. False discovery rates and copy number variation.Biometrika, 98(2):251–271, 2011

work page 2011

[17] [17]

D. C. Emanuel and J. D. MacBeth. Further results on the constant elasticity of variance call option pricing model.J. Financ. Quant. Anal., 17(4):533–554, 1982

work page 1982

[18] [18]

J. Fan, Y. Gu, and X. Li. Optimal estimation of a factorizable density using diffusion models with relu neural networks. 2025. arXiv:2510.03994

work page arXiv 2025

[19] [19]

Fitzsimmons, J

P. Fitzsimmons, J. Pitman, and M. Yor. Markovian bridges: construction, Palm interpretation, and splicing. InSeminar on Stochastic Processes, 1992 (Seattle, WA, 1992), volume 33 ofProgr. Probab., pages 101–134. Birkh¨ auser Boston, Boston, MA, 1993

work page 1992

[20] [20]

Floto, T

G. Floto, T. Jonsson, M. Nica, S. Sanner, and E. Z. Zhu. Diffusion on the probability simplex. 2023. arXiv:2309.02530

work page arXiv 2023

[21] [21]

X. Gao, J. Zha, and X. Y. Zhou. Data-driven generative simulation of SDEs using diffusion models. 2025. arXiv:2509.08731

work page arXiv 2025

[22] [22]

Y. Gao, H. Guo, T. Hoang, W. Huang, L. Jiang, F. Kong, H. Li, J. Li, L. Li, and X. Li. Seedance 1.0: Exploring the boundaries of video generation models. 2025. arXiv:2506.09113

work page internal anchor Pith review Pith/arXiv arXiv 2025

[23] [23]

State-of-the-art video and image generation with Veo 2 and Imagen 3.https://blog.google/ technology/google-labs/video-image-generation-update-december-2024/, 2024

Google. State-of-the-art video and image generation with Veo 2 and Imagen 3.https://blog.google/ technology/google-labs/video-image-generation-update-december-2024/, 2024

work page 2024

[24] [24]

I. S. Gradshteyn and I. M. Ryzhik.Table of integrals, series, and products. Elsevier/Academic Press, Amsterdam, eighth edition, 2015

work page 2015

[25] [25]

Gu and R

J. Gu and R. Koenker. Unobserved heterogeneity in income dynamics: An empirical Bayes perspective. J. Bus. Econom. Statist., 35(1):1–16, 2017

work page 2017

[26] [26]

Z. Guo, J. Li, W. Tang, and D. D. Yao. Diffusion generative models meet compressed sensing, with applications to imaging and finance. 2025. arXiv:2509.03898

work page arXiv 2025

[27] [27]

Z. Guo, W. Tang, and R. Xu. Conditional diffusion guidance under hard constraint: a stochastic analysis approach. 2026. arXiv:2602.05533

work page arXiv 2026

[28] [28]

U. G. Haussmann and E. Pardoux. Time reversal of diffusions.Ann. Probab., 14(4):1188–1205, 1986

work page 1986

[29] [29]

Heusel, H

M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local nash equilibrium. InNeurips, volume 30, pages 6629–6640, 2017

work page 2017

[30] [30]

J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. InNeurips, volume 33, pages 6840–6851, 2020

work page 2020

[31] [31]

Hyv¨ arinen

A. Hyv¨ arinen. Estimation of non-normalized statistical models by score matching.J. Mach. Learn. Res., 6:695–709, 2005

work page 2005

[32] [32]

Ignatiadis and B

N. Ignatiadis and B. Sen.Empirical Bayes: From Herbert Robbins to modern theory and appli- cations. 2025. Lecture notes available athttps://nignatiadis.github.io/assets/lecture_notes/ Empirical-Bayes.pdf

work page 2025

[33] [33]

Jeanblanc, M

M. Jeanblanc, M. Yor, and M. Chesney.Mathematical methods for financial markets. Springer Finance. Springer-Verlag London, Ltd., London, 2009

work page 2009

[34] [34]

Karatzas and S

I. Karatzas and S. E. Shreve.Brownian motion and stochastic calculus, volume 113 ofGraduate Texts in Mathematics. Springer-Verlag, New York, second edition, 1991

work page 1991

[35] [35]

Karras, M

T. Karras, M. Aittala, T. Aila, and S. Laine. Elucidating the design space of diffusion-based generative models. InNeurips, volume 35, pages 26565–26577, 2022

work page 2022

[36] [36]

Kawazu and S

K. Kawazu and S. Watanabe. Branching processes with immigration and related limit theorems.Teor. Verojatnost. i Primenen., 16:34–51, 1971

work page 1971

[37] [37]

Mercury: Ultra-Fast Language Models Based on Diffusion

S. Khanna, S. Kharbanda, S. Li, H. Varma, E. Wang, S. Birnbaum, Z. Luo, Y. Miraoui, A. Palrecha, and S. Ermon. Mercury: Ultra-fast language models based on diffusion. 2025. arXiv:2506.17298

work page internal anchor Pith review Pith/arXiv arXiv 2025

[38] [38]

Kim, S.-Y

G. Kim, S.-Y. Choi, and Y. Kim. A diffusion-based generative model for financial time series via geometric Brownian motion. 2025. arXiv:2507.19003

work page arXiv 2025

[39] [39]

Lamperti

J. Lamperti. Continuous state branching processes.Bull. Amer. Math. Soc., 73:382–386, 1967

work page 1967

[40] [40]

H. Lee, J. Lu, and Y. Tan. Convergence for score-based generative modeling with polynomial complexity. InNeurips, volume 35, pages 22870–22882, 2022. 26 WENPIN TANG, NIZAR TOUZI, ZIKUN ZHANG, AND XUN YU ZHOU

work page 2022

[41] [41]

G. Li, Y. Wei, Y. Chen, and Y. Chi. Towards faster non-asymptotic convergence for diffusion-based generative models. InICLR, 2024

work page 2024

[42] [42]

A. Liu, M. He, S. Zeng, S. Zhang, L. Zhang, C. Wu, W. Jia, Y. Liu, X. Zhou, and J. Zhou. WeDLM: Recon- ciling diffusion language models with standard causal attention for fast inference. 2025. arXiv:2512.22737

work page arXiv 2025

[43] [43]

H. Liu, T. Zhu, N. Jia, J. He, and Z. Zheng. Learning to simulate from heavy-tailed distribution via diffusion model. 2024. SSRN 4975931

work page 2024

[44] [44]

Masreliez

C. Masreliez. Approximate non-gaussian filtering with linear state and observation relations.IEEE Trans. Autom. Control, 20(1):107–110, 1975

work page 1975

[45] [45]

Miyasawa

K. Miyasawa. An empirical Bayes estimator of the mean of a normal population.Bull. Inst. Internat. Statist, 38(181-188):1–2, 1961

work page 1961

[46] [46]

S. Nie, F. Zhu, Z. You, X. Zhang, J. Ou, J. Hu, J. Zhou, Y. Lin, J.-R. Wen, and C. Li. Large language diffusion models. 2025. arXiv:2502.09992

work page internal anchor Pith review Pith/arXiv arXiv 2025

[47] [47]

K. Oko, S. Akiyama, and T. Suzuki. Diffusion models are minimax optimal distribution estimators. In ICML, pages 26517–26582, 2023

work page 2023

[48] [48]

Sora: Creating video from text.https://openai.com/sora, 2024

OpenAI. Sora: Creating video from text.https://openai.com/sora, 2024

work page 2024

[49] [49]

L. R. Pericchi and A. F. M. Smith. Exact and approximate posterior moments for a normal location parameter.J. Roy. Statist. Soc. Ser. B, 54(3):793–804, 1992

work page 1992

[50] [50]

Pitman and M

J. Pitman and M. Yor. A decomposition of Bessel bridges.Z. Wahrsch. Verw. Gebiete, 59(4):425–457, 1982

work page 1982

[51] [51]

N. G. Polson. A representation of the posterior mean for a location model.Biometrika, 78(2):426–430, 1991

work page 1991

[52] [52]

Hierarchical Text-Conditional Image Generation with CLIP Latents

A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen. Hierarchical text-conditional image generation with clip latents. arXiv:2204.06125

work page internal anchor Pith review Pith/arXiv arXiv

[53] [53]

Revuz and M

D. Revuz and M. Yor.Continuous martingales and Brownian motion, volume 293 ofGrundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin, third edition, 1999

work page 1999

[54] [54]

P. H. Richemond, S. Dieleman, and A. Doucet. Categorical SDEs with simplex diffusion. 2022. arXiv:2210.14784

work page arXiv 2022

[55] [55]

H. Robbins. An empirical Bayes approach to statistics. InProceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, vol. I, pages 157–163. Univ. California Press, Berkeley-Los Angeles, Calif., 1956

work page 1954

[56] [56]

L. Rogers. Which model for term-structure of interest rates should one use?IMA, 65:93, 1995

work page 1995

[57] [57]

Rombach, A

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. InCVPR, pages 10684–10695, 2022

work page 2022

[58] [58]

Ronneberger, P

O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmen- tation. InMICCAI, pages 234–241, 2015

work page 2015

[59] [59]

Saremi and A

S. Saremi and A. Hyv¨ arinen. Neural empirical Bayes.J. Mach. Learn. Res., 20:Paper No. 181, 23, 2019

work page 2019

[60] [60]

Shetty, M

N. Shetty, M. Prasath, and C. S. Seelamantula. Dale meets langevin: A multiplicative denoising diffusion model. 2025. arXiv:2510.02730

work page internal anchor Pith review arXiv 2025

[61] [61]

J. Shi, J. Feng, and W. Song. Estimation in linear regression with laplace measurement error using tweedie-type formula.J. Syst. Sci. Complex., 32(4):1211–1230, 2019

work page 2019

[62] [62]

Shirakawa

H. Shirakawa. Squared Bessel processes and their applications to the square root interest rate model. Asia-Pac. Financ. Mark., 9(3):169–190, 2002

work page 2002

[63] [63]

Singer, A

U. Singer, A. Polyak, T. Hayes, X. Yin, J. An, S. Zhang, Q. Hu, H. Yang, O. Ashual, and O. Gafni. Make-a-video: Text-to-video generation without text-video data. InICLR, 2023

work page 2023

[64] [64]

Song and S

Y. Song and S. Ermon. Generative modeling by estimating gradients of the data distribution. InNeurips, volume 32, page 11918–11930, 2019

work page 2019

[65] [65]

Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. InICLR, 2021

work page 2021

[66] [66]

C. J. Stone. Optimal rates of convergence for nonparametric estimators.Ann. Statist., 8(6):1348–1360, 1980

work page 1980

[67] [67]

C. J. Stone. Optimal global rates of convergence for nonparametric regression.Ann. Statist., 10(4):1040– 1053, 1982

work page 1982

[68] [68]

D. W. Stroock and S. R. S. Varadhan.Multidimensional diffusion processes, volume 233 ofGrundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1979. 27

work page 1979

[69] [69]

Tang and H

W. Tang and H. Zhao. Score-based diffusion models via stochastic differential equations.Statistic Surveys, 19:28–64, 2025

work page 2025

[70] [70]

Tang and H

W. Tang and H. Zhao. Contractive diffusion probabilistic models. 2026. To appear in SIAM J. Imaging Sci

work page 2026

[71] [71]

S. Torres. Tweedie calculus. 2026. arXiv:2604.14486

work page internal anchor Pith review Pith/arXiv arXiv 2026

[72] [72]

P. Vincent. A connection between score matching and denoising autoencoders.Neural Comput., 23(7):1661–1674, 2011

work page 2011

[73] [73]

J. Ye, Z. Xie, L. Zheng, J. Gao, Z. Wu, X. Jiang, Z. Li, and L. Kong. Dream 7b: Diffusion large language models. 2025. arXiv:2508.15487

work page internal anchor Pith review Pith/arXiv arXiv 2025

[74] [74]

Z. Zhao, C. Yeh, L. Kong, and K. Wang. Diffusion-DFL: decision-focused diffusion models for stochastic optimization. InICLR, 2026. Department of Industrial Engineering and Operations Research, Columbia University. Email address:wt2319@columbia.edu Department of Finance and Risk Engineering, New York University. Email address:nt2635@nyu.edu Department of I...

work page 2026