Flicker-DDPM: Accelerating Denoising Diffusion via 1/f Colored Noise Injection

FanCheng Li; Kexiang Mao

arxiv: 2606.03393 · v2 · pith:RLPNP53Cnew · submitted 2026-06-02 · 💻 cs.LG


Pith reviewed 2026-06-28 11:40 UTC · model grok-4.3


keywords: diffusion models, colored noise, 1/f noise, sampling acceleration, CIFAR-10, power spectra, self-organized criticality


The pith

Injecting 1/f colored noise into diffusion models reduces required sampling steps by over three times while preserving image quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Flicker-DDPM, which replaces white noise with flicker noise in the forward diffusion process to better align with the power-law spectra of natural images. This change allows the model to generate images of comparable or better quality than standard DDPMs but with significantly fewer denoising steps. The authors provide a frequency-domain theory showing that spectrally matched noise straightens the reverse trajectory, explaining the speedup. Experiments on CIFAR-10 demonstrate the practical gains with minimal extra computation.
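As a rough illustration of what swapping the noise source in the forward process could look like, the sketch below applies the standard DDPM forward marginal with a spectrally shaped Gaussian field in place of white noise. The filter here is a direct power-law filter in Fourier space, not the paper's kernel module; `power_law_noise`, `q_sample`, the schedule values, and the exponent 2.0 are all hypothetical choices made for this example.

```python
import numpy as np

def power_law_noise(shape, spectral_exponent, rng):
    """Gaussian field whose power spectrum falls off roughly as 1/k^spectral_exponent."""
    h, w = shape
    ky = np.fft.fftfreq(h)[:, None] * h        # integer wavenumbers along rows
    kx = np.fft.fftfreq(w)[None, :] * w        # integer wavenumbers along columns
    k = np.sqrt(kx**2 + ky**2)
    k[0, 0] = 1.0                              # avoid dividing by zero at the DC mode
    amplitude = k ** (-spectral_exponent / 2.0)  # power = amplitude^2
    white = rng.standard_normal(shape)
    colored = np.fft.ifft2(np.fft.fft2(white) * amplitude).real
    return colored / colored.std()             # rescale to unit empirical variance

def q_sample(x0, alpha_bar_t, noise):
    """DDPM forward marginal: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 300)           # assumed linear schedule, for illustration
alpha_bars = np.cumprod(1.0 - betas)
x0 = rng.uniform(-1.0, 1.0, size=(32, 32))     # stand-in for one image channel
eps = power_law_noise(x0.shape, spectral_exponent=2.0, rng=rng)
x_t = q_sample(x0, alpha_bars[150], eps)
```

Only the noise sampler changes relative to a white-noise DDPM; the mixing rule itself is untouched, which is consistent with the claim of low per-step overhead.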

Core claim

Flicker-DDPM adopts colored noise with power-law spectra generated via a spatial correlation kernel σ(d) = (d + 1)^{-η}, where tuning η controls the spectral exponent α to match dataset statistics. This spectrally matched noise linearizes the reverse trajectory in the frequency domain, enabling sampling acceleration without quality loss.

What carries the argument

The spatial correlation kernel σ(d) = (d + 1)^{-η} that produces tunable 1/f^α noise to match natural image spectra.
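One plausible way to draw noise from this kernel is to treat σ(d) as a stationary covariance on a periodic grid and diagonalize it with the FFT. The sketch below does that under several assumptions that are not taken from the paper: periodic boundary conditions, Euclidean pixel distance, and clipping of any negative eigenvalues to zero. The function names and the value of `eta` are hypothetical.

```python
import numpy as np

def kernel_spectrum(n, eta):
    """Fourier eigenvalues of the periodic covariance built from (d + 1)^(-eta)."""
    idx = np.arange(n)
    wrapped = np.minimum(idx, n - idx)            # periodic (torus) offsets
    d = np.sqrt(wrapped[:, None]**2 + wrapped[None, :]**2)
    cov = (d + 1.0) ** (-eta)                     # sigma(d) = (d + 1)^(-eta)
    eig = np.fft.fft2(cov).real                   # circulant covariance: FFT diagonalizes it
    return np.clip(eig, 0.0, None)                # assumption: drop any negative eigenvalues

def sample_kernel_noise(n, eta, rng):
    """One n x n Gaussian field with (approximately) the kernel's covariance."""
    eig = kernel_spectrum(n, eta)
    white = rng.standard_normal((n, n))
    field = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(eig)).real
    return field / np.sqrt(eig.mean())            # per-pixel variance equals mean(eig)

rng = np.random.default_rng(0)
noise = sample_kernel_noise(32, eta=1.0, rng=rng)
```

Smaller `eta` gives slower spatial decay and therefore more power at low wavenumbers; how `eta` maps to the spectral exponent α is the paper's theoretical claim and is not reproduced here.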

If this is right

On CIFAR-10, Flicker-DDPM matches or surpasses the baseline's generation quality with 3.33 times fewer sampling steps.
The acceleration comes at negligible additional cost per step.
The frequency-domain linear theory accounts for the observed speedup.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method may extend to other datasets by tuning η to their spectral properties.
It points to noise spectrum as a tunable parameter for diffusion model efficiency.
Testing on different image resolutions could reveal if the acceleration holds broadly.

Load-bearing premise

That the spectral match between the injected noise and natural images is the key factor allowing faster sampling without degrading generation quality.

What would settle it

Running the model with a mismatched η that produces noise spectra unlike the dataset's and checking whether the sampling speedup disappears while quality stays the same or worsens.
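Such a mismatch test needs a way to measure the spectral exponent of both the noise and the dataset. A minimal sketch, assuming square single-channel arrays and a plain log-log least-squares fit over integer radial shells (the function names and binning choice are this example's own, not the paper's protocol):

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged power spectrum of a square 2D array."""
    n = img.shape[0]
    power = np.abs(np.fft.fft2(img)) ** 2
    freqs = np.fft.fftfreq(n) * n                      # integer wavenumbers
    k = np.sqrt(freqs[:, None]**2 + freqs[None, :]**2)
    bins = np.rint(k).astype(int)                      # integer radial shells
    sums = np.bincount(bins.ravel(), weights=power.ravel())
    counts = np.bincount(bins.ravel())
    k_vals = np.arange(1, n // 2)                      # skip DC, stay below Nyquist
    return k_vals, sums[k_vals] / counts[k_vals]

def fit_spectral_exponent(img):
    """Estimate alpha in P(k) ~ k^(-alpha) from a log-log least-squares fit."""
    k_vals, p = radial_power_spectrum(img)
    slope, _ = np.polyfit(np.log(k_vals), np.log(p), 1)
    return -slope
```

Comparing the fitted exponent of the injected noise against the dataset average, for matched and deliberately mismatched η, would give the control the paragraph above asks for. The shell averaging biases the lowest wavenumbers slightly, so the estimate is approximate.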

Figures

Figures reproduced from arXiv: 2606.03393 by FanCheng Li, Kexiang Mao.

**Figure 2.** Schematic of Flicker-DDPM.

**Figure 3.** Final FID scores as a function of diffusion steps.

**Figure 5.** Generated samples at …

**Figure 6.** Radial power spectrum.

**Figure 8.** (Excerpt from the accompanying text.) Figure 8 focuses on k = 3, the mid-low frequency range where noise–signal competition is strongest. For Flicker-DDPM, the measured D(k=3, n) closely tracks the linear ODE prediction throughout the entire reverse trajectory. For white DDPM, the actual dynamics at k = 3 remain effectively frozen near unity, while the network preferentially rebuilds low-frequency power (k = 1–2) through nonlinear mode coupling. High-…

Original abstract

We propose a novel diffusion model, Flicker-DDPM, which incorporates flicker (1/f) noise inspired by self-organized criticality (SOC), a widely observed phenomenon in natural systems. Unlike denoising diffusion probabilistic models (DDPMs), which employ isotropic white noise in the forward process, Flicker-DDPM adopts colored noise with power-law spectra to better match the spectral statistics of natural images, whose power spectra typically follow P(k) ∝ 1/k^α. To this end, we develop a colored-noise module based on a spatial correlation kernel, σ(d) = (d + 1)^{-η}, and theoretically establish that adjusting η controls the spectral exponent α of the generated 1/f^α noise, enabling adaptation to datasets with diverse spectral characteristics. On CIFAR-10, Flicker-DDPM matches or surpasses the generation quality of a standard DDPM baseline using 3.33 times fewer sampling steps, with negligible additional computational cost per step. We further develop a frequency-domain linear theory demonstrating that spectrally matched colored noise linearizes the reverse trajectory, theoretically explaining the observed sampling acceleration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Flicker-DDPM claims a 3x sampling speedup on CIFAR-10 from 1/f noise plus a linear frequency theory, but the theory looks incompatible with the nonlinear U-Net.


The main takeaway is that Flicker-DDPM claims to cut the number of sampling steps by a factor of about 3.3 on CIFAR-10 while keeping generation quality the same or better, by swapping in colored 1/f noise for the usual white noise. They give a spatial kernel to generate it and a linear frequency-domain argument for why it speeds things up.

What the paper does is lay out a concrete way to control the noise spectrum through the parameter η in σ(d) = (d + 1)^{-η}, tying it to self-organized criticality ideas. The low overhead per step is a plus if the result replicates.

The soft spots are the lack of visible derivations or full experimental protocols in what we have, and especially the linear theory. Since the score is estimated by a nonlinear U-Net, it's not obvious that a linear analysis in frequency space explains the trajectory straightening. That assumption needs checking against the actual model.

This work would interest people focused on practical speedups in diffusion sampling. A reader looking for new noise schedules or spectrum-aware methods could find the kernel useful. The central claim is specific enough that it should go to peer review so the experiments and theory can be scrutinized properly.

Referee Report

2 major / 2 minor

Summary. The paper proposes Flicker-DDPM, a diffusion model variant that replaces isotropic white noise in the DDPM forward process with 1/f^α colored noise generated via the spatial correlation kernel σ(d)=(d+1)^{-η}. By tuning η the spectral exponent α is controlled to match natural-image power spectra P(k)∝1/k^α. On CIFAR-10 the method is reported to match or exceed a standard DDPM baseline in generation quality while using 3.33× fewer sampling steps at negligible extra per-step cost. A frequency-domain linear theory is presented to explain the speedup by showing that spectrally matched colored noise linearizes the reverse trajectory.

Significance. If the empirical speedup and the supporting theory hold, the work would offer a practical route to faster sampling in diffusion models together with an explanatory mechanism grounded in spectral statistics. The SOC-inspired noise construction and the explicit link between η and α constitute a concrete, tunable mechanism that could generalize beyond CIFAR-10.

major comments (2)

[frequency-domain linear theory] Frequency-domain linear theory (abstract and theory section): the central explanatory claim states that spectrally matched colored noise linearizes the reverse trajectory. However, the reverse process is realized by a U-Net that approximates a nonlinear score function. The manuscript does not state or justify the conditions under which the linear frequency-domain analysis remains valid once the nonlinear denoiser is inserted; without this justification the theory does not yet support the reported acceleration mechanism.
[results / experiments] Empirical claim (results section): the 3.33× step reduction on CIFAR-10 is load-bearing for the paper’s contribution, yet the abstract provides no information on the precise sampling schedule, number of training steps, FID computation protocol, or whether the same U-Net architecture and training budget were used for both Flicker-DDPM and the baseline. These details are required to assess whether the speedup is attributable to the colored noise rather than to other implementation differences.

minor comments (2)

[method / colored-noise module] The abstract states that the colored-noise module incurs “negligible additional computational cost per step,” but the manuscript should quantify the overhead of sampling from the spatial kernel σ(d) (e.g., via FFT or direct convolution) relative to standard Gaussian sampling.
[abstract / introduction] Notation: the abstract writes P(k) proportional to 1/k^α and later 1/f^α; a single consistent symbol (k or f) should be used throughout.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.


Referee: [frequency-domain linear theory] Frequency-domain linear theory (abstract and theory section): the central explanatory claim states that spectrally matched colored noise linearizes the reverse trajectory. However, the reverse process is realized by a U-Net that approximates a nonlinear score function. The manuscript does not state or justify the conditions under which the linear frequency-domain analysis remains valid once the nonlinear denoiser is inserted; without this justification the theory does not yet support the reported acceleration mechanism.

Authors: We agree that the frequency-domain linear theory is an approximation that does not rigorously account for the nonlinearity of the learned score function. The analysis is meant to provide mechanistic intuition for why spectral matching can accelerate sampling in the linear regime, with empirical results serving as the primary validation. In revision we will add an explicit limitations paragraph in the theory section stating the assumptions (e.g., approximate linearity of the score in the frequency domain for the early reverse steps) and clarifying that the linear model is explanatory rather than a complete proof for the nonlinear U-Net case. (Revision: yes)
Referee: [results / experiments] Empirical claim (results section): the 3.33× step reduction on CIFAR-10 is load-bearing for the paper’s contribution, yet the abstract provides no information on the precise sampling schedule, number of training steps, FID computation protocol, or whether the same U-Net architecture and training budget were used for both Flicker-DDPM and the baseline. These details are required to assess whether the speedup is attributable to the colored noise rather than to other implementation differences.

Authors: All implementation details (identical U-Net architecture, training budget, FID protocol, and the exact linear schedule with 300 steps for Flicker-DDPM versus 1000 for the baseline) are reported in Section 4 and the supplementary material. We nevertheless accept that the abstract should be self-contained on these points. We will revise the abstract to state that the same architecture and training budget were used and to specify the sampling schedules compared. (Revision: yes)
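
For orientation, the step counts quoted in this simulated response are at least arithmetically consistent with the headline ratio. The counts themselves come from the simulated rebuttal above and have not been checked against the paper here; the symbols $N_{\text{baseline}}$ and $N_{\text{Flicker}}$ are introduced only for this note.

```latex
\[
\frac{N_{\text{baseline}}}{N_{\text{Flicker}}}
  = \frac{1000}{300}
  = 3.\overline{3}
  \approx 3.33
\]
```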

Circularity Check

0 steps flagged

No significant circularity detected


The paper's core claims consist of an empirical demonstration on CIFAR-10 (matching baseline quality at 3.33× fewer steps) and a derived frequency-domain linear theory linking spectrally matched colored noise (via the kernel σ(d)=(d+1)^{-η} controlling α) to linearized reverse trajectories. These elements are presented as independent theoretical derivations and experimental results rather than reductions to fitted inputs or self-citations; the provided text contains no load-bearing self-citation chains, no renaming of known results as novel, and no predictions that collapse by construction to the model's own parameters or observations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Assessment based solely on abstract; full parameter lists and derivations unavailable.

free parameters (1)

η
Single parameter in the correlation kernel that sets the spectral exponent α of the injected noise.

axioms (1)

Domain assumption: natural images exhibit power spectra P(k) ∝ 1/k^α.
Used to justify replacing white noise with 1/f noise.



Reference graph

Works this paper leans on

67 extracted references · 12 canonical work pages · 7 internal anchors

[1] E. Milotti, 1/f noise: a pedagogical review, arXiv preprint physics/0204033, doi:10.48550/arXiv.physics/0204033 (2002)

[2] F. Hooge, T. Kleinpenning, and L. K. Vandamme, Experimental studies on 1/f noise, Reports on Progress in Physics 44, 479 (1981)

[3] P. Uttley, I. McHardy, and I. Papadakis, Measuring the broad-band power spectra of active galactic nuclei with RXTE, Monthly Notices of the Royal Astronomical Society 332, 231 (2002)

[4] K. A. Dill, S. B. Ozkan, M. S. Shell, and T. R. Weikl, The protein folding problem, Annu. Rev. Biophys. 37, 289 (2008)

[5] K. Masuki and Y. Ashida, Generative diffusion model with inverse renormalization group flows, arXiv preprint arXiv:2501.09064, doi:10.48550/arXiv.2501.09064 (2025)

[6] C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley, Long-range correlations in nucleotide sequences, Nature 356, 168 (1992)

[7] R. F. Voss and J. Clarke, 1/f noise in speech and music, Nature 258, 317 (1975)

[8] D. L. Ruderman, The statistics of natural images, Network: Computation in Neural Systems 5, 517 (1994)

[9] A. Torralba and A. Oliva, Statistics of natural image categories, Network: Computation in Neural Systems 14, 391 (2003)

[10] D. Ruderman and W. Bialek, Statistics of natural images: Scaling in the woods, Advances in Neural Information Processing Systems 6 (1993)

[11] A. van der Schaaf and J. H. van Hateren, Modelling the power spectra of natural images: statistics and information, Vision Research 36, 2759 (1996)

[12] S. Saremi and T. J. Sejnowski, Hierarchical model of natural images and the origin of scale invariance, Proceedings of the National Academy of Sciences 110, 3071 (2013)

[13] P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality: An explanation of the 1/f noise, Physical Review Letters 59, 381 (1987)

[14] P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality, Physical Review A 38, 364 (1988)

[15] J. Ho, A. Jain, and P. Abbeel, Denoising diffusion probabilistic models, Advances in Neural Information Processing Systems 33, 6840 (2020)

[16] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, Deep unsupervised learning using nonequilibrium thermodynamics, in International Conference on Machine Learning (PMLR, 2015) pp. 2256–2265

[17] P. Dhariwal and A. Nichol, Diffusion models beat GANs on image synthesis, in Advances in Neural Information Processing Systems, Vol. 34 (2021) pp. 8780–8794

[18] E. Hoogeboom, V. G. Satorras, C. Vignac, and M. Welling, Equivariant diffusion for molecule generation in 3D, in International Conference on Machine Learning (PMLR, 2022) pp. 8867–8887

[19] Z. Kong, W. Ping, J. Huang, K. Zhao, and B. Catanzaro, DiffWave: A versatile diffusion model for audio synthesis, arXiv preprint arXiv:2009.09761, doi:10.48550/arXiv.2009.09761 (2020)

[20] J. L. Watson, D. Juergens, N. R. Bennett, B. L. Trippe, J. Yim, H. E. Eisenach, W. Ahern, A. J. Borst, R. J. Ragotte, L. F. Milles, et al., De novo design of protein structure and function with RFdiffusion, Nature 620, 1089 (2023)

[21] J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, et al., Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature 630, 493 (2024)

[22] N. Rahaman, A. Baratin, D. Arber, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, and A. Courville, On the spectral bias of neural networks, in International Conference on Machine Learning (PMLR, 2019) pp. 5301–5310

[23] S. Lin, B. Liu, J. Li, and X. Yang, Common diffusion noise schedules and sample steps are flawed, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (2024) pp. 5404–5411

[24] D. Kingma, T. Salimans, B. Poole, and J. Ho, Variational diffusion models, in Advances in Neural Information Processing Systems, Vol. 34 (2021) pp. 21696–21707

[25] R. Gao et al., A Fourier space perspective on diffusion models, arXiv:2505.11278, doi:10.48550/arXiv.2505.11278 (2025)

[26] T. Jiralerspong, B. Earnshaw, J. Hartford, Y. Bengio, and L. Scimeca, Shaping inductive bias in diffusion models through frequency-based noise control, arXiv preprint arXiv:2502.10236, doi:10.48550/arXiv.2502.10236 (2025)

[27] S. Chandran, N. R. d. Santos, Y. Wu, G. V. Steeg, and E. Papalexakis, Spectral regularization for diffusion models, arXiv preprint arXiv:2603.02447, doi:10.48550/arXiv.2603.02447 (2026)

[28] S. Rissanen, M. Heinonen, and A. Solin, Generative modelling with inverse heat dissipation, arXiv preprint arXiv:2206.13397, doi:10.48550/arXiv.2206.13397 (2022)

[29] C. Berg, J. P. R. Christensen, and P. Ressel, Harmonic analysis on semigroups: theory of positive definite and related functions, Vol. 100 (Springer, 1984)

[30] X. Huang, C. Salaun, C. Vasconcelos, C. Theobalt, C. Oztireli, and G. Singh, Blue noise for diffusion models, in ACM SIGGRAPH 2024 Conference Papers (2024) pp. 1–11

[31] B. Matérn, Spatial variation (Springer Science & Business Media, 2013)

[32] C. K. Williams and C. E. Rasmussen, Gaussian processes for machine learning, Vol. 2 (MIT Press, Cambridge, MA, 2006)

[33] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-based generative modeling through stochastic differential equations, arXiv preprint arXiv:2011.13456, doi:10.48550/arXiv.2011.13456 (2020)

[34] Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, Flow matching for generative modeling, in International Conference on Learning Representations (2023)

[35] J. Song, C. Meng, and S. Ermon, Denoising diffusion implicit models, arXiv preprint arXiv:2010.02502, doi:10.48550/arXiv.2010.02502 (2020)

[36] T. Salimans and J. Ho, Progressive distillation for fast sampling of diffusion models, arXiv preprint arXiv:2202.00512, doi:10.48550/arXiv.2202.00512 (2022)

[37] A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images, Tech. Rep. (2009)

[38] O. Ronneberger, P. Fischer, and T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015) pp. 234–241

[39] A. Q. Nichol and P. Dhariwal, Improved denoising diffusion probabilistic models, in International Conference on Machine Learning (PMLR, 2021) pp. 8162–8171

[40] B. Zou, Denoising diffusion probabilistic model, https://github.com/zoubohao/DenoisingDiffusionProbabilityModel-ddpm- (2021), GitHub repository

[41] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, doi:10.48550/arXiv.1412.6980 (2014)

[42] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, GANs trained by a two time-scale update rule converge to a local Nash equilibrium, Advances in Neural Information Processing Systems 30, doi:10.48550/arXiv.1706.08500 (2017)

[43] P. C. Martin, E. D. Siggia, and H. A. Rose, Statistical dynamics of classical systems, Physical Review A 8, 423 (1973)

[44] H.-K. Janssen, On a Lagrangean for classical field dynamics and renormalization group calculations of dynamical critical properties, Zeitschrift für Physik B Condensed Matter 23, 377 (1976)

[45] B. D. Anderson, Reverse-time diffusion equation models, Stochastic Processes and their Applications 12, 313 (1982)

[46] Y. Song and P. Dhariwal, Improved techniques for training consistency models, in International Conference on Learning Representations, Vol. 2024 (2024) pp. 15078–15097

[47] T. Karras, M. Aittala, T. Aila, and S. Laine, Elucidating the design space of diffusion-based generative models, in Advances in Neural Information Processing Systems, Vol. 35 (2022) pp. 26565–26577

[48] C. Gardiner, Stochastic methods, Vol. 4 (Springer Berlin Heidelberg, 2009)

[49] A. Altland and B. D. Simons, Condensed matter field theory (Cambridge University Press, 2010)

Appendix A: Derivation of the Power Spectrum ODE

We derive Eq. (15) from first principles. The starting point is the VP-type reverse SDE in real space:

$$dx(r,t) = \left[\tfrac{1}{2}\beta(t)\,x(r,t) + \beta(t)\,s_\theta(r;\{x\},t)\right]dt + \sqrt{\beta(t)}\,dw(r,t), \tag{A1}$$

where $s_\theta = \nabla_x \log p_t(x)$ is the score fu...

[50] Fourier transform. Define the unitary discrete Fourier transform (DFT):

$$\tilde{x}(k,t) = \frac{1}{N^{d/2}} \sum_{r} x(r,t)\,e^{-ik\cdot r}. \tag{A2}$$

Since the DFT is linear, applying it to Eq. (A1) yields

$$d\tilde{x}(k) = \left[\tfrac{1}{2}\beta\,\tilde{x}(k) + \beta\,\tilde{s}_\theta(k;\{\tilde{x}\})\right]dt + \sqrt{\beta\,\tilde{\Sigma}(k)}\;d\tilde{w}(k), \tag{A3}$$

where $\tilde{\Sigma}(k)$ is the Fourier eigenvalue of the noise covariance ($\tilde{\Sigma} = 1$ for white noise), and the transformed noise satisfies ⟨dw̃...

[51] Linearization of the score function. For data drawn from a translationally invariant distribution, the Jacobian $J(r,r') = \partial s_\theta(r)/\partial x(r')$ depends only on $r - r'$ in a statistical sense. A translation-invariant operator is diagonalized by the DFT with eigenvalues $\Lambda(k,t) = \sum_{r} J(r)\,e^{-ik\cdot r}$. Thus the first-order Taylor expansion of $\tilde{s}_\theta$ in Fourier space is mode-diag...

[52] Itô's lemma for the power spectrum. Define the mean $m(t) \equiv \langle\tilde{x}(t)\rangle$ and fluctuation $y(t) \equiv \tilde{x}(t) - m(t)$. Since $\langle d\tilde{w}\rangle = 0$, the mean satisfies $dm/dt = \mu m + \beta c$, and the fluctuation obeys [48]:

$$dy = \mu\,y\,dt + \sqrt{\beta\,\tilde{\Sigma}}\;d\tilde{w}. \tag{A6}$$

For $k \neq 0$, $y$ is complex: $y = y_R + i\,y_I$. The complex noise decomposes as $d\tilde{w} = (dw_R + i\,dw_I)/\sqrt{2}$, so:

$$dy_R = \mu\,y_R\,dt + \sqrt{\beta\,\tilde{\Sigma}/2}\;dw_R, \qquad dy_I = \mu\,y_I\,dt + \sqrt{\beta\,\tilde{\Sigma}/2}\;dw_I,$$

...

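[53] Relation between $\varepsilon_\theta$ and $\gamma$. In DDPM, the noise-prediction network satisfies $\varepsilon_\theta = -\sqrt{1-\bar{\alpha}_t}\,s_\theta$, where $s_\theta$ is the score function. Under the linearization (A4), in Fourier space:

$$\tilde{s}_\theta(k) = -\gamma(k,t)\,\tilde{x}_t(k) + c(k,t), \tag{B1}$$

so that

$$\tilde{\varepsilon}_\theta(k) = \sqrt{1-\bar{\alpha}_t}\;\gamma(k,t)\,\tilde{x}_t(k) + \text{const}. \tag{B2}$$

Therefore $\gamma(k,t)$ equals the slope of $\tilde{\varepsilon}_\theta$ regressed on $\tilde{x}_t$, divided by $\sqrt{1-\bar{\alpha}_t}$: $\gamma(k,t) = \frac{1}{\sqrt{1-\bar{\alpha}_t}}$...
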
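The regression described in this fragment could be carried out per Fourier mode roughly as follows. The source's closed-form expression (B3) is truncated, so the estimator below is an assumption: an ordinary complex least-squares slope across samples, with its real part rescaled per (B2). The function name and argument layout are hypothetical.

```python
import numpy as np

def estimate_gamma(x_k, eps_k, alpha_bar_t):
    """Estimate gamma(k, t) for one mode from paired samples.

    x_k, eps_k: 1D complex arrays holding the Fourier coefficient of x_t and of
    the network's noise prediction at a fixed wavevector k, one entry per sample.
    """
    x_c = x_k - x_k.mean()                 # remove the mean so the constant term drops out
    e_c = eps_k - eps_k.mean()
    # np.vdot conjugates its first argument: sum(conj(x_c) * e_c) / sum(|x_c|^2)
    slope = np.vdot(x_c, e_c) / np.vdot(x_c, x_c)
    return slope.real / np.sqrt(1.0 - alpha_bar_t)
```

A nonzero imaginary part of the fitted slope would indicate a phase shift that the real-valued γ of the linearized theory does not capture, which may be worth logging alongside the estimate.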
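[54] Discrete variance propagation. The DDPM reverse update rule is:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\varepsilon_\theta(x_t,t)\right) + \sigma_t z, \tag{B4}$$

where $z \sim \mathcal{N}(0,\Sigma)$. Under linearization (B2), the Fourier-space update becomes:

$$\tilde{x}_{t-1}(k) = \frac{1-(1-\alpha_t)\,\gamma(k,t)}{\sqrt{\alpha_t}}\,\tilde{x}_t(k) + \text{mean-field} + \sigma_t\,\tilde{z}(k). \tag{B5}$$

Taking the variance (fluctuation part only) yields the discrete power-spectrum recurrence: $D_{t-1}(k) =$ ...
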
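The recurrence itself is cut off in this extract. As a hedged sketch only, the code below takes the variance of (B5) under two assumptions that are not confirmed by the source: the injected noise mode is independent of the current state and has variance equal to the noise spectrum, and γ is real. The authors' Eq. (B6) may differ in form; `variance_step` and its argument names are hypothetical.

```python
import numpy as np

def variance_step(D_t, alpha_t, gamma, sigma_t, noise_spec):
    """One reverse step of the per-mode variance implied by the linearized update (B5).

    Assumes Var[a * x + sigma * z] = a^2 * Var[x] + sigma^2 * Var[z] with z
    independent of x and Var[z] = noise_spec (the Fourier eigenvalue of Sigma).
    """
    a = (1.0 - (1.0 - alpha_t) * gamma) / np.sqrt(alpha_t)   # deterministic gain in (B5)
    return a**2 * D_t + sigma_t**2 * noise_spec

# Iterate a few reverse steps with made-up constants, purely for illustration.
D = 1.0
for _ in range(10):
    D = variance_step(D, alpha_t=0.98, gamma=1.0, sigma_t=0.1, noise_spec=1.0)
```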
[55] Matérn covariance and power spectrum. The Matérn covariance in $d$ dimensions is:

$$C_\nu(r) = \frac{\sigma^2}{2^{\nu-1}\,\Gamma(\nu)}\left(\frac{r}{\xi}\right)^{\nu} K_\nu\!\left(\frac{r}{\xi}\right), \tag{D1}$$

where $K_\nu$ is the modified Bessel function of the second kind, $\xi$ is the correlation length, and $\nu > 0$ controls smoothness. Its 2D Fourier transform (power spectrum) is:

$$\tilde{\Sigma}_{\mathrm{Mat}}(k) = \sigma^2\,(k^2 + \kappa^2)^{-(\nu+1)}, \qquad \kappa \equiv 1/\xi. \tag{D2}$$

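[56] High-frequency asymptotic. For $k \gg \kappa$ (which holds for all observable modes $k \in [1,14]$ on a 32×32 grid with $\kappa \approx 1$):

$$\tilde{\Sigma}_{\mathrm{Mat}}(k) \approx \sigma^2\,k^{-2(\nu+1)}. \tag{D3}$$

Matching to the measured data spectrum $P_{\mathrm{data}}(k) \propto k^{-\alpha}$ gives:

$$\alpha = 2(\nu+1) \;\Longrightarrow\; \nu = \frac{\alpha-2}{2}. \tag{D4}$$
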
[57] Real-space asymptotic and envelope matching. The large-$r$ asymptotic of the Matérn correlation is:

$$C_\nu(r) \sim A\,r^{\nu-1/2}\,e^{-r/\xi}, \qquad r \to \infty. \tag{D5}$$

On a finite lattice ($N = 32$), when $\kappa$ is small, the exponential factor $e^{-r/\xi}$ varies slowly over the accessible range $r \in [1,N]$, and the correlation shape is dominated by the algebraic envelope $r^{\nu-1/2}$. Our kernel $C(r) = (r+1)^{-\eta} \sim r^{-\eta}$ for large $r$ ...

[58] Sub-leading correction. The leading-order prediction uses the asymptotic ($k \to \infty$) spectral exponent $\alpha_\infty$. On a finite grid, the relevant frequency scale is the mid-frequency range $k^* \approx 3$–$7$ where signal–noise competition is strongest. The Matérn local log-slope is:

$$\alpha_{\mathrm{loc}}(k) = 2(\nu+1)\,\frac{k^2}{k^2+\kappa^2} < \alpha_\infty. \tag{D9}$$

Using $\alpha_{\mathrm{eff}}(k^*)$ in place of $\alpha_\infty$ gives the corrected formul...

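The local log-slope (D9) can be checked numerically against the Matérn spectrum (D2). The sketch below compares the closed form with a central finite difference of log Σ̃ with respect to log k; the parameter values (ν = 0.5, κ = 1) and function names are chosen only for illustration.

```python
import numpy as np

def matern_spectrum(k, nu, kappa, sigma2=1.0):
    """Eq. (D2): sigma^2 * (k^2 + kappa^2)^(-(nu + 1))."""
    return sigma2 * (k**2 + kappa**2) ** (-(nu + 1.0))

def alpha_loc(k, nu, kappa):
    """Eq. (D9): local spectral exponent, i.e. minus the log-log slope of (D2)."""
    return 2.0 * (nu + 1.0) * k**2 / (k**2 + kappa**2)

def numeric_log_slope(k, nu, kappa, h=1e-5):
    """Central difference of -d log(Sigma) / d log(k)."""
    hi = np.log(matern_spectrum(k * np.exp(h), nu, kappa))
    lo = np.log(matern_spectrum(k * np.exp(-h), nu, kappa))
    return -(hi - lo) / (2.0 * h)

k = np.arange(1.0, 15.0)                  # observable modes on a 32 x 32 grid
closed = alpha_loc(k, nu=0.5, kappa=1.0)
numeric = numeric_log_slope(k, nu=0.5, kappa=1.0)
```

With these values the local exponent at k = 3 is 2.7, noticeably below the asymptotic value of 3, which illustrates why a mid-frequency correction changes the predicted exponent on a small grid.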
[59] Construction of the MSRJD action. For a single $k$-mode (suppressing $k$ labels), the linearized reverse SDE (A5) for the real component reads:

$$\dot{x} = \mu(t)\,x + f(t) + \frac{\sigma(t)}{\sqrt{2}}\,\xi(t), \tag{E1}$$

where $\sigma(t) = \sqrt{\beta(t)\,\tilde{\Sigma}}$ and $\xi$ is real white noise.

Step 1 (noise probability): $P[\xi] \propto \exp\!\left(-\tfrac{1}{2}\int_0^T \xi^2\,dt\right)$.

Step 2 (enforce SDE via δ-function): Insert $1 = \int \mathcal{D}\hat{p}\,\exp\!\left(i\int \hat{p}\,\big[\dot{x} - \mu x - f - \tfrac{\sigma}{\sqrt{2}}\,\xi\big]\,dt\right)$ ...

[60] Propagators. The retarded (causal) Green's function satisfies $[\partial_t - \mu(t)]\,G^R(t,t') = \delta(t-t')$ with $G^R(t,t') = 0$ for $t < t'$:

$$G^R(t,t') = \theta(t-t')\,\exp\!\left(\int_{t'}^{t}\mu(\tau)\,d\tau\right). \tag{E4}$$

The Keldysh propagator (equal-time limit gives the power spectrum):

$$G^K(t,t') = \int d\tau\;G^R(t,\tau)\,\sigma(\tau)^2\,G^R(t',\tau)^{*}. \tag{E5}$$

Setting $t = t'$: $D(t) = G^K(t,t) = \int_{-\infty}^{t} |G^R(t,\tau)|^2\,\sigma(\tau)^2\,d\tau$

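[61] Recovery of the power-spectrum ODE. Differentiating $D(t) = G^K(t,t)$ using the Leibniz rule and $G^R(t,t) = 1$:

$$\frac{dD}{dt} = |G^R(t,t)|^2\,\sigma^2(t) + \int_{-\infty}^{t}\frac{\partial}{\partial t}|G^R(t,\tau)|^2\,\sigma^2(\tau)\,d\tau = \sigma^2(t) + 2\mu(t)\int_{-\infty}^{t}|G^R(t,\tau)|^2\,\sigma^2(\tau)\,d\tau = \sigma^2(t) + 2\mu(t)\,D(t), \tag{E6}$$

where we used $\partial_t G^R(t,\tau) = \mu(t)\,G^R(t,\tau)$ for $t > \tau$. Substituting $\mu = \beta\left(\tfrac{1}{2}-\gamma\right)$ and $\sigma^2 = \beta\,\tilde{\Sigma}$:

$$\frac{dD}{dt} = \beta\,\tilde{\Sigma} + 2\beta\left(\tfrac{1}{2}-\gamma\right)D = \beta\left[(1-2\gamma)\,D + \tilde{\Sigma}\right]. \tag{E7}$$

This is identical to Eq. ...
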
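To make (E7) concrete, the sketch below integrates it for one mode with constant coefficients and compares the result with the closed-form solution of a linear first-order ODE. Constant β and γ are a simplification made for this example; in the source both depend on t (and γ on k). The function names and parameter values are hypothetical.

```python
import numpy as np

def integrate_power_ode(D0, beta, gamma, noise_spec, t_end, n_steps):
    """Forward-Euler integration of dD/dt = beta * ((1 - 2*gamma) * D + noise_spec)."""
    dt = t_end / n_steps
    D = D0
    for _ in range(n_steps):
        D += dt * beta * ((1.0 - 2.0 * gamma) * D + noise_spec)
    return D

def closed_form(D0, beta, gamma, noise_spec, t):
    """Exact solution for constant coefficients (requires gamma != 1/2)."""
    rate = beta * (1.0 - 2.0 * gamma)
    d_star = -noise_spec / (1.0 - 2.0 * gamma)     # fixed point where dD/dt = 0
    return d_star + (D0 - d_star) * np.exp(rate * t)

D_num = integrate_power_ode(D0=5.0, beta=1.0, gamma=1.0, noise_spec=2.0,
                            t_end=3.0, n_steps=20000)
D_exact = closed_form(D0=5.0, beta=1.0, gamma=1.0, noise_spec=2.0, t=3.0)
```

For γ > 1/2 the mode relaxes exponentially toward the fixed point Σ̃/(2γ − 1), so within this linear picture the stationary power of each mode is set directly by the noise spectrum.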
[62] Higher-order expansion of the score function. Beyond the linear approximation (A4), the score function admits a systematic expansion:

$$\tilde{s}_\theta(k) = -\gamma(k,t)\,\tilde{x}(k) + c(k,t) + \sum_{k_1+k_2=k} V_3(k;k_1,k_2;t)\,\tilde{x}(k_1)\,\tilde{x}(k_2) + \cdots \tag{F1}$$

where the three-wave coupling vertex is:

$$V_3(k;k_1,k_2;t) = \frac{1}{2}\,\frac{\partial^2 \tilde{s}_\theta(k)}{\partial\tilde{x}(k_1)\,\partial\tilde{x}(k_2)}. \tag{F2}$$

Translational invariance enforces momentum c... This follows because the translation operator $T_a$ acts as $\tilde{x}(k) \to e^{ik\cdot a}\,\tilde{x}(k)$, and equivariance $\tilde{s}_\theta(k;\{T_a x\}) = e^{ik\cdot a}\,\tilde{s}_\theta(k;\{x\})$ requires $e^{ik\cdot a} = e^{i(k_1+k_2)\cdot a}$ for all $a$

[63] Interacting MSRJD action. The full action decomposes as $S = S_0 + S_{\mathrm{int}}$, where $S_0$ is the Gaussian action (E3) summed over all $k$, and:

$$S_{\mathrm{int}} = -\sum_{k}\int dt\;\beta(t)\,\tilde{p}^{*}(k)\sum_{k_1+k_2=k} V_3(k;k_1,k_2)\,\tilde{x}(k_1)\,\tilde{x}(k_2) + \cdots \tag{F3}$$

Each $V_3$ vertex connects one response field $\tilde{p}^{*}(k)$ to two physical fields $\tilde{x}(k_1)$, $\tilde{x}(k_2)$, carrying algebraic weight $-\beta V_3$

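[64] Feynman rules. The free propagators from $S_0$ are:

$$G^R(\omega;k) = \langle\tilde{x}(k)\,\tilde{p}^{*}(k)\rangle_0 = \frac{-1}{i\omega + \mu(k)}, \tag{F4}$$

$$G^K(\omega;k) = \langle\tilde{x}(k)\,\tilde{x}^{*}(k)\rangle_0 = \frac{\beta\,\tilde{\Sigma}(k)}{\omega^2 + \mu(k)^2}, \tag{F5}$$

$$\langle\tilde{p}\,\tilde{p}^{*}\rangle_0 = 0. \tag{F6}$$

The vanishing of $\langle\tilde{p}\,\tilde{p}^{*}\rangle$ ensures that every closed loop must contain at least one $G^K$ line, a causality constraint intrinsic to the MSRJD formalism
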
[65] Single-loop self-energy. The lowest-order correction to the retarded propagator is the single-loop ("sunset") diagram with two $V_3$ vertices:

$$\delta\Sigma^R(k,\omega) = 2\beta^2\sum_{k_1}|V_3(k;k_1,k-k_1)|^2\int\frac{d\omega_1}{2\pi}\;G^K(\omega_1;k_1)\,G^R(\omega-\omega_1;k-k_1). \tag{F7}$$

The factor of 2 is the symmetry factor from exchanging the two $\tilde{x}$ legs at each vertex

[66] Frequency integral. The $\omega_1$ integral is evaluated by contour integration. With $\mu_1 \equiv \mu(k_1)$, $\mu_q \equiv \mu(k-k_1)$, $\sigma_1^2 \equiv \beta\,\tilde{\Sigma}(k_1)$:

$$I = \int\frac{d\omega_1}{2\pi}\;\frac{\sigma_1^2}{\omega_1^2+\mu_1^2}\cdot\frac{-1}{i(\omega-\omega_1)+\mu_q}. \tag{F8}$$

The integrand has poles at $\omega_1 = \pm i\mu_1$ (from $G^K$) and $\omega_1 = \omega + i\mu_q$ (from $G^R$). Closing the contour in the upper half-plane (for $\mu_1 < 0$, the pole at $\omega_1 = i\mu_1$ lies in the upper half-plane): $I =$ ...

[67] Physical interpretation. The self-energy $\delta\Sigma^R$ modifies the effective drift: $\mu_{\mathrm{eff}}(k) = \mu(k) + \delta\Sigma^R(k,0)$. Since $\delta\Sigma^R < 0$ (Eq. F10), the nonlinear coupling enhances the effective restoring force: other modes' fluctuations, mediated by $V_3$, provide additional damping. The full (dressed) propagator satisfies the Dyson equation:

$$G^R_{\mathrm{full}}(\omega;k) = \frac{-1}{i\omega + \mu(k) + \delta\Sigma^R(k,\omega)},$$

(...

[1] [1]

1/f noise: a pedagogical review

E. Milotti, 1/f noise: a pedagogical re- view, arXiv preprint physics/0204033 10.48550/arXiv.physics/0204033 (2002)

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.physics/0204033 2002

[2] [2]

Hooge, T

F. Hooge, T. Kleinpenning, and L. K. Vandamme, Ex- perimental studies on 1/f noise, Reports on progress in Physics44, 479 (1981)

1981

[3] [3]

Uttley, I

P. Uttley, I. McHardy, and I. Papadakis, Measuring the broad-band power spectra of active galactic nuclei with rxte, Monthly Notices of the Royal Astronomical Society 332, 231 (2002)

2002

[4] [4]

K. A. Dill, S. B. Ozkan, M. S. Shell, and T. R. Weikl, The protein folding problem, Annu. Rev. Biophys.37, 289 (2008)

2008

[5] [5]

Masuki and Y

K. Masuki and Y. Ashida, Generative diffusion model with inverse renormalization group flows, arXiv preprint arXiv:2501.09064 10.48550/arXiv.2501.09064 (2025)

work page doi:10.48550/arxiv.2501.09064 2025

[6] [6]

C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley, Long-range correlations in nucleotide sequences, Nature356, 168 (1992)

1992

[7] R. F. Voss and J. Clarke, 1/f noise in speech and music, Nature 258, 317 (1975).

[8] D. L. Ruderman, The statistics of natural images, Network: Computation in Neural Systems 5, 517 (1994).

[9] A. Torralba and A. Oliva, Statistics of natural image categories, Network: Computation in Neural Systems 14, 391 (2003).

[10] D. Ruderman and W. Bialek, Statistics of natural images: Scaling in the woods, Advances in Neural Information Processing Systems 6 (1993).

[11] A. van der Schaaf and J. H. van Hateren, Modelling the power spectra of natural images: statistics and information, Vision Research 36, 2759 (1996).

[12] S. Saremi and T. J. Sejnowski, Hierarchical model of natural images and the origin of scale invariance, Proceedings of the National Academy of Sciences 110, 3071 (2013).

[13] P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality: An explanation of the 1/f noise, Physical Review Letters 59, 381 (1987).

[14] P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality, Physical Review A 38, 364 (1988).

[15] J. Ho, A. Jain, and P. Abbeel, Denoising diffusion probabilistic models, Advances in Neural Information Processing Systems 33, 6840 (2020).

[16] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, Deep unsupervised learning using nonequilibrium thermodynamics, in International Conference on Machine Learning (PMLR, 2015) pp. 2256–2265.

[17] P. Dhariwal and A. Nichol, Diffusion models beat GANs on image synthesis, in Advances in Neural Information Processing Systems, Vol. 34 (2021) pp. 8780–8794.

[18] E. Hoogeboom, V. G. Satorras, C. Vignac, and M. Welling, Equivariant diffusion for molecule generation in 3D, in International Conference on Machine Learning (PMLR, 2022) pp. 8867–8887.

[19] Z. Kong, W. Ping, J. Huang, K. Zhao, and B. Catanzaro, DiffWave: A versatile diffusion model for audio synthesis, arXiv preprint arXiv:2009.09761, doi:10.48550/arXiv.2009.09761 (2020).

[20] J. L. Watson, D. Juergens, N. R. Bennett, B. L. Trippe, J. Yim, H. E. Eisenach, W. Ahern, A. J. Borst, R. J. Ragotte, L. F. Milles, et al., De novo design of protein structure and function with RFdiffusion, Nature 620, 1089 (2023).

[21] J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, et al., Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature 630, 493 (2024).

[22] N. Rahaman, A. Baratin, D. Arber, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, and A. Courville, On the spectral bias of neural networks, in International Conference on Machine Learning (PMLR, 2019) pp. 5301–5310.

[23] S. Lin, B. Liu, J. Li, and X. Yang, Common diffusion noise schedules and sample steps are flawed, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (2024) pp. 5404–5411.

[24] D. Kingma, T. Salimans, B. Poole, and J. Ho, Variational diffusion models, in Advances in Neural Information Processing Systems, Vol. 34 (2021) pp. 21696–21707.

[25] R. Gao et al., A Fourier space perspective on diffusion models, arXiv:2505.11278, doi:10.48550/arXiv.2505.11278 (2025).

[26] T. Jiralerspong, B. Earnshaw, J. Hartford, Y. Bengio, and L. Scimeca, Shaping inductive bias in diffusion models through frequency-based noise control, arXiv preprint arXiv:2502.10236, doi:10.48550/arXiv.2502.10236 (2025).

[27] S. Chandran, N. R. d. Santos, Y. Wu, G. V. Steeg, and E. Papalexakis, Spectral regularization for diffusion models, arXiv preprint arXiv:2603.02447, doi:10.48550/arXiv.2603.02447 (2026).

[28] S. Rissanen, M. Heinonen, and A. Solin, Generative modelling with inverse heat dissipation, arXiv preprint arXiv:2206.13397, doi:10.48550/arXiv.2206.13397 (2022).

[29] C. Berg, J. P. R. Christensen, and P. Ressel, Harmonic analysis on semigroups: theory of positive definite and related functions, Vol. 100 (Springer, 1984).

[30] X. Huang, C. Salaun, C. Vasconcelos, C. Theobalt, C. Oztireli, and G. Singh, Blue noise for diffusion models, in ACM SIGGRAPH 2024 Conference Papers (2024) pp. 1–11.

[31] B. Matérn, Spatial variation (Springer Science & Business Media, 2013).

[32] C. K. Williams and C. E. Rasmussen, Gaussian processes for machine learning, Vol. 2 (MIT Press, Cambridge, MA, 2006).

[33] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-based generative modeling through stochastic differential equations, arXiv preprint arXiv:2011.13456, doi:10.48550/arXiv.2011.13456 (2020).

[34] Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, Flow matching for generative modeling, in International Conference on Learning Representations (2023).

[35] J. Song, C. Meng, and S. Ermon, Denoising diffusion implicit models, arXiv preprint arXiv:2010.02502, doi:10.48550/arXiv.2010.02502 (2020).

[36] T. Salimans and J. Ho, Progressive distillation for fast sampling of diffusion models, arXiv preprint arXiv:2202.00512, doi:10.48550/arXiv.2202.00512 (2022).

[37] A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images, Tech. Rep. (2009).

[38] O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015) pp. 234–241.

[39] A. Q. Nichol and P. Dhariwal, Improved denoising diffusion probabilistic models, in International Conference on Machine Learning (PMLR, 2021) pp. 8162–8171.

[40] B. Zou, Denoising diffusion probabilistic model, https://github.com/zoubohao/DenoisingDiffusionProbabilityModel-ddpm- (2021), GitHub repository.

[41] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, doi:10.48550/arXiv.1412.6980 (2014).

[42] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, GANs trained by a two time-scale update rule converge to a local Nash equilibrium, Advances in Neural Information Processing Systems 30, doi:10.48550/arXiv.1706.08500 (2017).

[43] P. C. Martin, E. D. Siggia, and H. A. Rose, Statistical dynamics of classical systems, Physical Review A 8, 423 (1973).

[44] H.-K. Janssen, On a Lagrangean for classical field dynamics and renormalization group calculations of dynamical critical properties, Zeitschrift für Physik B Condensed Matter 23, 377 (1976).

[45] B. D. Anderson, Reverse-time diffusion equation models, Stochastic Processes and their Applications 12, 313 (1982).

[46] Y. Song and P. Dhariwal, Improved techniques for training consistency models, in International Conference on Learning Representations, Vol. 2024 (2024) pp. 15078–15097.

[47] T. Karras, M. Aittala, T. Aila, and S. Laine, Elucidating the design space of diffusion-based generative models, in Advances in Neural Information Processing Systems, Vol. 35 (2022) pp. 26565–26577.

[48] C. Gardiner, Stochastic methods, Vol. 4 (Springer Berlin Heidelberg, 2009).

[49] A. Altland and B. D. Simons, Condensed matter field theory (Cambridge University Press, 2010).

Appendix A: Derivation of the Power Spectrum ODE

We derive Eq. (15) from first principles. The starting point is the VP-type reverse SDE in real space:
\[ dx(\mathbf{r}, t) = \left[ \tfrac{1}{2}\beta(t)\, x(\mathbf{r}, t) + \beta(t)\, s_\theta(\mathbf{r}; \{x\}, t) \right] dt + \sqrt{\beta(t)}\, dw(\mathbf{r}, t), \tag{A1} \]
where $s_\theta = \nabla_x \log p_t(x)$ is the score fu...

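Fourier transform

Define the unitary discrete Fourier transform (DFT):
\[ \tilde{x}(\mathbf{k}, t) = \frac{1}{N^{d/2}} \sum_{\mathbf{r}} x(\mathbf{r}, t)\, e^{-i\mathbf{k}\cdot\mathbf{r}}. \tag{A2} \]
Since the DFT is linear, applying it to Eq. (A1) yields
\[ d\tilde{x}(\mathbf{k}) = \left[ \tfrac{1}{2}\beta\, \tilde{x}(\mathbf{k}) + \beta\, \tilde{s}_\theta(\mathbf{k}; \{\tilde{x}\}) \right] dt + \sqrt{\beta\, \tilde{\Sigma}(\mathbf{k})}\, d\tilde{w}(\mathbf{k}), \tag{A3} \]
where $\tilde{\Sigma}(\mathbf{k})$ is the Fourier eigenvalue of the noise covariance ($\tilde{\Sigma} = 1$ for white noise), and the transformed noise satisfies $\langle d\tilde{w}$...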
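As a small illustration of the unitary convention in Eq. (A2), the sketch below uses NumPy's orthonormal FFT, which applies the same $N^{-d/2}$ prefactor for $d = 2$. The grid size and the white-noise test field are assumptions chosen for the example, not values taken from the experiments.

```python
import numpy as np

def unitary_dft2(x):
    """Unitary 2D DFT: x~(k) = N^{-d/2} sum_r x(r) exp(-i k.r), with d = 2."""
    return np.fft.fft2(x, norm="ortho")

rng = np.random.default_rng(0)
N = 32                                # assumed grid size for the example
x = rng.standard_normal((N, N))       # white-noise test field (assumption)
x_hat = unitary_dft2(x)

# Unitarity implies Parseval: total power agrees in both domains.
power_real = np.sum(x**2)
power_fourier = np.sum(np.abs(x_hat)**2)

# For unit-variance white noise the mean power per mode is close to 1,
# the white-noise value of the noise eigenvalue quoted in the text.
mean_mode_power = np.mean(np.abs(x_hat)**2)
```

With this normalization no extra factors of $N$ appear when variances are moved between real space and Fourier space, which is why the noise term in Eq. (A3) carries only $\tilde{\Sigma}(\mathbf{k})$.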

Linearization of the score function

For data drawn from a translationally invariant distribution, the Jacobian $J(\mathbf{r}, \mathbf{r}') = \partial s_\theta(\mathbf{r}) / \partial x(\mathbf{r}')$ depends only on $\mathbf{r} - \mathbf{r}'$ in a statistical sense. A translation-invariant operator is diagonalized by the DFT with eigenvalues $\Lambda(\mathbf{k}, t) = \sum_{\mathbf{r}} J(\mathbf{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}}$. Thus the first-order Taylor expansion of $\tilde{s}_\theta$ in Fourier space is mode-diag...
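The diagonalization claim can be checked numerically on a 1D periodic lattice. The sketch below is only an illustration: the kernel `J` is a random, hypothetical stand-in for the score Jacobian, and the lattice size is arbitrary.

```python
import numpy as np

N = 16
rng = np.random.default_rng(1)
J = rng.standard_normal(N)   # hypothetical kernel J(r) on a periodic 1D lattice

# Translation-invariant operator: A[r, r'] = J((r - r') mod N).
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
A = J[idx]

# Eigenvalues predicted by the DFT: Lambda(k) = sum_r J(r) exp(-i k r).
Lam = np.fft.fft(J)

# Unitary DFT matrix F[k, r] = exp(-2 pi i k r / N) / sqrt(N).
F = np.fft.fft(np.eye(N), norm="ortho")
A_fourier = F @ A @ F.conj().T

# All off-diagonal entries should vanish up to round-off.
off_diag = A_fourier - np.diag(np.diag(A_fourier))
max_off_diag = np.max(np.abs(off_diag))
```

The diagonal of `A_fourier` reproduces `Lam`, so in this toy setting each Fourier mode evolves independently once the operator is translation invariant.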

Itô's lemma for the power spectrum

Define the mean $m(t) \equiv \langle \tilde{x}(t) \rangle$ and the fluctuation $y(t) \equiv \tilde{x}(t) - m(t)$. Since $\langle d\tilde{w} \rangle = 0$, the mean satisfies $dm/dt = \mu m + \beta c$, and the fluctuation obeys [48]:
\[ dy = \mu\, y\, dt + \sqrt{\beta \tilde{\Sigma}}\, d\tilde{w}. \tag{A6} \]
For $\mathbf{k} \neq 0$, $y$ is complex: $y = y_R + i y_I$. The complex noise decomposes as $d\tilde{w} = (dw_R + i\, dw_I)/\sqrt{2}$, so:
\[ dy_R = \mu\, y_R\, dt + \sqrt{\beta \tilde{\Sigma}/2}\; dw_R, \qquad dy_I = \mu\, y_I\, dt + \sqrt{\beta \tilde{\Sigma}/2}\; dw_I, \]
...
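A Monte Carlo sketch of Eq. (A6) for a single mode is given below, under the simplifying assumption that $\beta$, $\gamma$, and $\tilde{\Sigma}$ are constant in time (all three values are hypothetical). It integrates the real and imaginary parts with Euler-Maruyama steps and compares $\langle |y|^2 \rangle$ with the stationary value $\beta\tilde{\Sigma}/(2|\mu|)$ obtained by setting $dD/dt = 0$ in $dD/dt = 2\mu D + \beta\tilde{\Sigma}$.

```python
import numpy as np

rng = np.random.default_rng(2)
beta, Sigma, gamma = 1.0, 0.5, 1.5    # hypothetical constant coefficients
mu = beta * (0.5 - gamma)             # drift; negative here, so the mode is stable
dt, n_steps, n_paths = 1e-2, 1000, 10000

y = np.zeros(n_paths, dtype=complex)
amp = np.sqrt(beta * Sigma / 2.0)     # noise amplitude of each real component
for _ in range(n_steps):
    dw_r = rng.standard_normal(n_paths) * np.sqrt(dt)
    dw_i = rng.standard_normal(n_paths) * np.sqrt(dt)
    # Euler-Maruyama step for dy = mu y dt + sqrt(beta Sigma) (dw_R + i dw_I)/sqrt(2)
    y = y + mu * y * dt + amp * (dw_r + 1j * dw_i)

D_mc = np.mean(np.abs(y)**2)                 # Monte Carlo power of the mode
D_star = beta * Sigma / (2.0 * abs(mu))      # stationary solution of the variance ODE
```

The two real components each carry half of the noise power, so their variances add up to the full $\beta\tilde{\Sigma}$ driving term for $|y|^2$.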

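Relation between $\varepsilon_\theta$ and $\gamma$

In DDPM, the noise-prediction network satisfies $\varepsilon_\theta = -\sqrt{1 - \bar{\alpha}_t}\; s_\theta$, where $s_\theta$ is the score function. Under the linearization (A4), in Fourier space:
\[ \tilde{s}_\theta(\mathbf{k}) = -\gamma(\mathbf{k}, t)\, \tilde{x}_t(\mathbf{k}) + c(\mathbf{k}, t), \tag{B1} \]
so that
\[ \tilde{\varepsilon}_\theta(\mathbf{k}) = \sqrt{1 - \bar{\alpha}_t}\; \gamma(\mathbf{k}, t)\, \tilde{x}_t(\mathbf{k}) + \text{const}. \tag{B2} \]
Therefore $\gamma(\mathbf{k}, t)$ equals the slope of $\tilde{\varepsilon}_\theta$ regressed on $\tilde{x}_t$, divided by $\sqrt{1 - \bar{\alpha}_t}$: $\gamma(\mathbf{k}, t) = \frac{1}{\sqrt{1 - \bar{\alpha}_t}}$ ...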
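The regression described above can be sketched as follows. This is a minimal illustration on synthetic data, not the measurement pipeline used in the paper: the helper `estimate_gamma`, the complex least-squares estimator, and all numerical values are assumptions introduced for the example.

```python
import numpy as np

def estimate_gamma(x_hat, eps_hat, alpha_bar):
    """Least-squares slope of eps_hat on x_hat (complex samples of one mode),
    divided by sqrt(1 - alpha_bar)."""
    xc = x_hat - x_hat.mean()          # remove the constant offset in Eq. (B2)
    ec = eps_hat - eps_hat.mean()
    slope = np.vdot(xc, ec) / np.vdot(xc, xc)   # sum(conj(x) e) / sum(|x|^2)
    return slope.real / np.sqrt(1.0 - alpha_bar)

# Synthetic samples that obey Eq. (B2) with a known gamma (hypothetical values).
rng = np.random.default_rng(3)
n = 5000
alpha_bar, gamma_true = 0.6, 0.8
x_hat = rng.standard_normal(n) + 1j * rng.standard_normal(n)
resid = 0.05 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
eps_hat = np.sqrt(1.0 - alpha_bar) * gamma_true * x_hat + (0.3 - 0.1j) + resid

gamma_est = estimate_gamma(x_hat, eps_hat, alpha_bar)
```

In practice `x_hat` and `eps_hat` would be the Fourier coefficients of noisy inputs and network outputs at a fixed mode and timestep, collected over a batch.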

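Discrete variance propagation

The DDPM reverse update rule is:
\[ x_{t-1} = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\, \varepsilon_\theta(x_t, t) \right) + \sigma_t z, \tag{B4} \]
where $z \sim \mathcal{N}(0, \Sigma)$. Under linearization (B2), the Fourier-space update becomes:
\[ \tilde{x}_{t-1}(\mathbf{k}) = \frac{1 - (1 - \alpha_t)\, \gamma(\mathbf{k}, t)}{\sqrt{\alpha_t}}\, \tilde{x}_t(\mathbf{k}) + \text{mean-field} + \sigma_t \tilde{z}(\mathbf{k}). \tag{B5} \]
Taking the variance (fluctuation part only) yields the discrete power-spectrum recurrence: $D_{t-1}(\mathbf{k}) =$ ...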
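The sketch below propagates a single-mode variance through one step of Eq. (B5) and checks it against sampling. It is our reading of (B5) under two stated assumptions: the injected noise is independent of $\tilde{x}_t$, and its per-mode variance is $\tilde{\Sigma}(\mathbf{k})$. The function name `variance_step`, the real-valued samples, and all numbers are hypothetical.

```python
import numpy as np

def variance_step(D_t, alpha_t, gamma, sigma_t, Sigma):
    """Variance after one linearized reverse step, as implied by Eq. (B5)
    when the injected noise is independent of the current state."""
    a = (1.0 - (1.0 - alpha_t) * gamma) / np.sqrt(alpha_t)
    return a**2 * D_t + sigma_t**2 * Sigma

# Hypothetical single-mode values, for illustration only.
alpha_t, gamma, sigma_t, Sigma, D_t = 0.98, 0.7, 0.1, 0.5, 2.0

rng = np.random.default_rng(4)
n = 200000
x_t = np.sqrt(D_t) * rng.standard_normal(n)      # fluctuation with variance D_t
z = np.sqrt(Sigma) * rng.standard_normal(n)      # colored-noise amplitude of this mode
a = (1.0 - (1.0 - alpha_t) * gamma) / np.sqrt(alpha_t)
x_prev = a * x_t + sigma_t * z                   # fluctuation part of Eq. (B5)

D_pred = variance_step(D_t, alpha_t, gamma, sigma_t, Sigma)
D_mc = x_prev.var()
```

Iterating `variance_step` from $t = T$ down to $t = 1$ gives a per-mode spectrum trajectory that can be compared with spectra measured from samples.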

Matérn covariance and power spectrum

The Matérn covariance in $d$ dimensions is:
\[ C_\nu(r) = \frac{\sigma^2}{2^{\nu - 1}\, \Gamma(\nu)} \left( \frac{r}{\xi} \right)^{\nu} K_\nu\!\left( \frac{r}{\xi} \right), \tag{D1} \]
where $K_\nu$ is the modified Bessel function of the second kind, $\xi$ is the correlation length, and $\nu > 0$ controls smoothness. Its 2D Fourier transform (power spectrum) is:
\[ \tilde{\Sigma}_{\text{Mat}}(k) = \sigma^2 \left( k^2 + \kappa^2 \right)^{-(\nu + 1)}, \qquad \kappa \equiv 1/\xi. \tag{D2} \]

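High-frequency asymptotic

For $k \gg \kappa$ (which holds for all observable modes $k \in [1, 14]$ on a $32 \times 32$ grid with $\kappa \approx 1$):
\[ \tilde{\Sigma}_{\text{Mat}}(k) \approx \sigma^2 k^{-2(\nu + 1)}. \tag{D3} \]
Matching to the measured data spectrum $P_{\text{data}}(k) \propto k^{-\alpha}$ gives:
\[ \alpha = 2(\nu + 1) \;\Longrightarrow\; \nu = \frac{\alpha - 2}{2}. \tag{D4} \]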
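The matching in Eq. (D4) is easy to verify numerically. The sketch below evaluates the spectrum of Eq. (D2) far above $\kappa$ and measures its log-log slope; the exponent $\alpha = 3$ and the test wavenumbers are hypothetical values chosen so that the asymptote clearly applies.

```python
import numpy as np

def matern_spectrum(k, nu, kappa, sigma2=1.0):
    """Power spectrum of Eq. (D2): sigma^2 (k^2 + kappa^2)^{-(nu + 1)}."""
    return sigma2 * (k**2 + kappa**2) ** (-(nu + 1.0))

def nu_from_alpha(alpha):
    """Smoothness parameter from a measured spectral exponent, Eq. (D4)."""
    return (alpha - 2.0) / 2.0

alpha = 3.0                       # hypothetical measured exponent
nu = nu_from_alpha(alpha)
k = np.array([64.0, 128.0])       # far above kappa, so Eq. (D3) applies
S = matern_spectrum(k, nu, kappa=1.0)

# Log-log slope between the two wavenumbers; should approach alpha.
slope = -np.log(S[1] / S[0]) / np.log(k[1] / k[0])
```

At wavenumbers comparable to $\kappa$ the measured slope falls below $\alpha$, which is the finite-grid effect that the sub-leading correction addresses.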

Real-space asymptotic and envelope matching

The large-$r$ asymptotic of the Matérn correlation is:
\[ C_\nu(r) \sim A\, r^{\nu - 1/2}\, e^{-r/\xi}, \qquad r \to \infty. \tag{D5} \]
On a finite lattice ($N = 32$), when $\kappa$ is small, the exponential factor $e^{-r/\xi}$ varies slowly over the accessible range $r \in [1, N]$, and the correlation shape is dominated by the algebraic envelope $r^{\nu - 1/2}$. Our kernel $C(r) = (r + 1)^{-\eta} \sim r^{-\eta}$ for large $r$ ...

Sub-leading correction

The leading-order prediction uses the asymptotic ($k \to \infty$) spectral exponent $\alpha_\infty$. On a finite grid, the relevant frequency scale is the mid-frequency range $k_* \approx 3$–$7$ where signal–noise competition is strongest. The Matérn local log-slope is:
\[ \alpha_{\text{loc}}(k) = 2(\nu + 1)\, \frac{k^2}{k^2 + \kappa^2} < \alpha_\infty. \tag{D9} \]
Using $\alpha_{\text{eff}}(k_*)$ in place of $\alpha_\infty$ gives the corrected formul...
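Eq. (D9) can be checked against a numerical derivative of the Matérn spectrum of Eq. (D2). In the sketch below, $\nu = 0.5$ and $\kappa = 1$ are hypothetical values, and the wavenumbers span the mid-frequency range quoted in the text.

```python
import numpy as np

def alpha_loc(k, nu, kappa):
    """Local log-slope of the Matern spectrum, Eq. (D9)."""
    return 2.0 * (nu + 1.0) * k**2 / (k**2 + kappa**2)

def log_spectrum(k, nu, kappa):
    """log of Eq. (D2), up to the additive constant log(sigma^2)."""
    return -(nu + 1.0) * np.log(k**2 + kappa**2)

nu, kappa = 0.5, 1.0              # hypothetical values
k = np.arange(3.0, 8.0)           # mid-frequency range k = 3, ..., 7

# Centered finite difference of -d log(Sigma) / d log(k).
h = 1e-5
numeric = -(log_spectrum(k * np.exp(h), nu, kappa)
            - log_spectrum(k * np.exp(-h), nu, kappa)) / (2.0 * h)
analytic = alpha_loc(k, nu, kappa)
alpha_inf = 2.0 * (nu + 1.0)      # asymptotic exponent from Eq. (D4)
```

For these values the local slope rises from 2.7 at $k = 3$ to 2.94 at $k = 7$, always below the asymptotic exponent of 3.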

Construction of the MSRJD action

For a single $\mathbf{k}$-mode (suppressing $\mathbf{k}$ labels), the linearized reverse SDE (A5) for the real component reads:
\[ \dot{x} = \mu(t)\, x + f(t) + \frac{\sigma(t)}{\sqrt{2}}\, \xi(t), \tag{E1} \]
where $\sigma(t) = \sqrt{\beta(t)\, \tilde{\Sigma}}$ and $\xi$ is real white noise.

Step 1 (noise probability): $P[\xi] \propto \exp\!\left( -\tfrac{1}{2} \int_0^T \xi^2\, dt \right)$.

Step 2 (enforce SDE via $\delta$-function): Insert $1 = \int \mathcal{D}\hat{p}\, \exp\!\left( i \int \hat{p} \left[ \dot{x} - \mu x - f - \tfrac{\sigma}{\sqrt{2}}\, \xi \right] dt \right)$ ...

Propagators

The retarded (causal) Green's function satisfies $[\partial_t - \mu(t)]\, G^R(t, t') = \delta(t - t')$ with $G^R(t, t') = 0$ for $t < t'$:
\[ G^R(t, t') = \theta(t - t')\, \exp\!\left( \int_{t'}^{t} \mu(\tau)\, d\tau \right). \tag{E4} \]
The Keldysh propagator (whose equal-time limit gives the power spectrum) is:
\[ G^K(t, t') = \int d\tau\; G^R(t, \tau)\, \sigma(\tau)^2\, G^R(t', \tau)^{*}. \tag{E5} \]
Setting $t = t'$: $D(t) = G^K(t, t) = \int_{-\infty}^{t} |G^R(t, \tau)|^2\, \sigma(\tau)^2\, d\tau$.
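For constant $\mu < 0$ and constant $\sigma$, the equal-time Keldysh integral can be evaluated in closed form as $\sigma^2 / (2|\mu|)$. The sketch below checks this by direct quadrature; the constants, the finite integration window, and the helper name `G_R` are assumptions made for the example.

```python
import numpy as np

def G_R(t, tau, mu):
    """Retarded propagator of Eq. (E4) for constant drift mu:
    theta(t - tau) * exp(mu * (t - tau))."""
    return np.where(t >= tau, np.exp(mu * (t - tau)), 0.0)

mu, sigma2 = -0.8, 0.3            # hypothetical constants (mu < 0 for stability)
t = 0.0
tau = np.linspace(t - 40.0, t, 400001)   # the window replaces the -infinity limit

# Equal-time Keldysh integral D(t) = int |G_R(t, tau)|^2 sigma^2 dtau (trapezoid rule).
integrand = np.abs(G_R(t, tau, mu))**2 * sigma2
dtau = tau[1] - tau[0]
D_num = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dtau

D_exact = sigma2 / (2.0 * abs(mu))
```

The same quadrature works for time-dependent $\mu(\tau)$ and $\sigma(\tau)$ once the exponent in `G_R` is replaced by a cumulative integral of the drift.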

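Recovery of the power-spectrum ODE

Differentiating $D(t) = G^K(t, t)$ using the Leibniz rule and $G^R(t, t) = 1$:
\[ \frac{dD}{dt} = |G^R(t, t)|^2 \sigma^2(t) + \int_{-\infty}^{t} \frac{\partial}{\partial t} |G^R(t, \tau)|^2\, \sigma^2(\tau)\, d\tau = \sigma^2(t) + 2\mu(t) \int_{-\infty}^{t} |G^R(t, \tau)|^2\, \sigma^2(\tau)\, d\tau = \sigma^2(t) + 2\mu(t)\, D(t), \tag{E6} \]
where we used $\partial_t G^R(t, \tau) = \mu(t)\, G^R(t, \tau)$ for $t > \tau$. Substituting $\mu = \beta\left( \tfrac{1}{2} - \gamma \right)$ and $\sigma^2 = \beta \tilde{\Sigma}$:
\[ \frac{dD}{dt} = \beta \tilde{\Sigma} + 2\beta \left( \tfrac{1}{2} - \gamma \right) D = \beta \left[ (1 - 2\gamma)\, D + \tilde{\Sigma} \right]. \tag{E7} \]
This is identical to Eq. ...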
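To see Eq. (E7) in action, the sketch below integrates it with a classical fourth-order Runge-Kutta scheme and compares the result with the closed form available when $\gamma$ and $\tilde{\Sigma}$ are constant: $D(t) = D_* + (D_0 - D_*)\, e^{(1 - 2\gamma) B(t)}$, with $D_* = \tilde{\Sigma}/(2\gamma - 1)$ and $B(t) = \int_0^t \beta$. The linear schedule and all numerical values are hypothetical.

```python
import numpy as np

def beta(t):
    """Hypothetical linear noise schedule."""
    return 0.1 + 1.9 * t

def rhs(t, D, gamma, Sigma):
    """Right-hand side of Eq. (E7)."""
    return beta(t) * ((1.0 - 2.0 * gamma) * D + Sigma)

def integrate_rk4(D0, T, n, gamma, Sigma):
    """Classical RK4 integration of dD/dt = rhs from t = 0 to t = T."""
    h = T / n
    D, t = D0, 0.0
    for _ in range(n):
        k1 = rhs(t, D, gamma, Sigma)
        k2 = rhs(t + h / 2, D + h * k1 / 2, gamma, Sigma)
        k3 = rhs(t + h / 2, D + h * k2 / 2, gamma, Sigma)
        k4 = rhs(t + h, D + h * k3, gamma, Sigma)
        D += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return D

gamma, Sigma, D0, T = 1.2, 0.5, 3.0, 1.0     # hypothetical constants
D_num = integrate_rk4(D0, T, 1000, gamma, Sigma)

B = 0.1 * T + 0.95 * T**2                    # integral of beta from 0 to T
D_star = Sigma / (2.0 * gamma - 1.0)         # fixed point of Eq. (E7)
D_exact = D_star + (D0 - D_star) * np.exp((1.0 - 2.0 * gamma) * B)
```

The fixed point exists only for $\gamma > 1/2$; for smaller $\gamma$ the drift is non-negative and the mode power grows instead of relaxing.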

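Higher-order expansion of the score function

Beyond the linear approximation (A4), the score function admits a systematic expansion:
\[ \tilde{s}_\theta(\mathbf{k}) = -\gamma(\mathbf{k}, t)\, \tilde{x}(\mathbf{k}) + c(\mathbf{k}, t) + \sum_{\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}} V_3(\mathbf{k}; \mathbf{k}_1, \mathbf{k}_2; t)\, \tilde{x}(\mathbf{k}_1)\, \tilde{x}(\mathbf{k}_2) + \cdots \tag{F1} \]
where the three-wave coupling vertex is:
\[ V_3(\mathbf{k}; \mathbf{k}_1, \mathbf{k}_2; t) = \frac{1}{2}\, \frac{\partial^2 \tilde{s}_\theta(\mathbf{k})}{\partial \tilde{x}(\mathbf{k}_1)\, \partial \tilde{x}(\mathbf{k}_2)}. \tag{F2} \]
Translational invariance enforces momentum c... This follows because the translation operator $T_{\mathbf{a}}$ acts as $\tilde{x}(\mathbf{k}) \to e^{i\mathbf{k}\cdot\mathbf{a}}\, \tilde{x}(\mathbf{k})$, and equivariance $\tilde{s}_\theta(\mathbf{k}; \{T_{\mathbf{a}} x\}) = e^{i\mathbf{k}\cdot\mathbf{a}}\, \tilde{s}_\theta(\mathbf{k}; \{x\})$ requires $e^{i\mathbf{k}\cdot\mathbf{a}} = e^{i(\mathbf{k}_1 + \mathbf{k}_2)\cdot\mathbf{a}}$ for all $\mathbf{a}$.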
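The constraint $\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}$ can be illustrated with the simplest translation-equivariant nonlinearity, a local quadratic term $x(r)^2$ on a 1D periodic lattice. This toy term is an assumption made for the example and is not the learned score; with the unitary DFT it corresponds to a constant vertex $N^{-1/2}$ on the shell $k_1 + k_2 \equiv k \pmod N$.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 16
x = rng.standard_normal(N)
x_hat = np.fft.fft(x, norm="ortho")          # unitary DFT of the field

# Fourier transform of the local quadratic term, computed directly.
q_hat_direct = np.fft.fft(x**2, norm="ortho")

# Same quantity from mode coupling restricted to k1 + k2 = k (mod N).
q_hat_conv = np.zeros(N, dtype=complex)
for k in range(N):
    for k1 in range(N):
        k2 = (k - k1) % N                    # momentum conservation fixes k2
        q_hat_conv[k] += x_hat[k1] * x_hat[k2]
q_hat_conv /= np.sqrt(N)                     # constant vertex N^{-1/2}
```

Pairs of modes with $k_1 + k_2 \not\equiv k$ never contribute, which is why the sum in Eq. (F1) runs over a single free momentum.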

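Interacting MSRJD action

The full action decomposes as $S = S_0 + S_{\text{int}}$, where $S_0$ is the Gaussian action (E3) summed over all $\mathbf{k}$, and:
\[ S_{\text{int}} = -\sum_{\mathbf{k}} \int dt\; \beta(t)\, \tilde{p}^{*}(\mathbf{k}) \sum_{\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}} V_3(\mathbf{k}; \mathbf{k}_1, \mathbf{k}_2)\, \tilde{x}(\mathbf{k}_1)\, \tilde{x}(\mathbf{k}_2) + \cdots \tag{F3} \]
Each $V_3$ vertex connects one response field $\tilde{p}^{*}(\mathbf{k})$ to two physical fields $\tilde{x}(\mathbf{k}_1)$, $\tilde{x}(\mathbf{k}_2)$, carrying algebraic weight $-\beta V_3$.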

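Feynman rules

The free propagators from $S_0$ are:
\[ G^R(\omega; \mathbf{k}) = \langle \tilde{x}(\mathbf{k})\, \tilde{p}^{*}(\mathbf{k}) \rangle_0 = \frac{-1}{i\omega + \mu(\mathbf{k})}, \tag{F4} \]
\[ G^K(\omega; \mathbf{k}) = \langle \tilde{x}(\mathbf{k})\, \tilde{x}^{*}(\mathbf{k}) \rangle_0 = \frac{\beta\, \tilde{\Sigma}(\mathbf{k})}{\omega^2 + \mu(\mathbf{k})^2}, \tag{F5} \]
\[ \langle \tilde{p}\, \tilde{p}^{*} \rangle_0 = 0. \tag{F6} \]
The vanishing of $\langle \tilde{p}\, \tilde{p}^{*} \rangle$ ensures that every closed loop must contain at least one $G^K$ line, a causality constraint intrinsic to the MSRJD formalism.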
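As a consistency check between the frequency-domain rule (F5) and the equal-time power spectrum, the sketch below integrates $G^K(\omega)$ over $\omega / 2\pi$ and compares the result with the stationary variance $\beta\tilde{\Sigma} / (2|\mu|)$. The constants, the cutoff `W`, and the leading-order tail estimate are assumptions made for the example.

```python
import numpy as np

beta, Sigma, mu = 1.0, 0.5, -1.0      # hypothetical constants for one mode

# Keldysh propagator of Eq. (F5) on a finite frequency window [-W, W].
W = 1.0e3
omega = np.linspace(-W, W, 2_000_001)
G_K = beta * Sigma / (omega**2 + mu**2)

# Trapezoid rule for int d(omega)/(2 pi) G_K over the window.
d_omega = omega[1] - omega[0]
core = np.sum(0.5 * (G_K[1:] + G_K[:-1])) * d_omega / (2.0 * np.pi)

# Leading-order estimate of the two truncated tails, where G_K ~ beta Sigma / omega^2.
tail = beta * Sigma / (np.pi * W)

D_freq = core + tail
D_stationary = beta * Sigma / (2.0 * abs(mu))
```

With $\mu = \beta(\tfrac{1}{2} - \gamma)$ this stationary value equals $\tilde{\Sigma}/(2\gamma - 1)$, the fixed point of the power-spectrum ODE.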

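Single-loop self-energy

The lowest-order correction to the retarded propagator is the single-loop ("sunset") diagram with two $V_3$ vertices:
\[ \delta\Sigma^R(\mathbf{k}, \omega) = 2\beta^2 \sum_{\mathbf{k}_1} |V_3(\mathbf{k}; \mathbf{k}_1, \mathbf{k} - \mathbf{k}_1)|^2 \int \frac{d\omega_1}{2\pi}\; G^K(\omega_1; \mathbf{k}_1)\, G^R(\omega - \omega_1; \mathbf{k} - \mathbf{k}_1). \tag{F7} \]
The factor of 2 is the symmetry factor from exchanging the two $\tilde{x}$ legs at each vertex.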

Frequency integral

The $\omega_1$ integral is evaluated by contour integration. With $\mu_1 \equiv \mu(\mathbf{k}_1)$, $\mu_q \equiv \mu(\mathbf{k} - \mathbf{k}_1)$, $\sigma_1^2 \equiv \beta\, \tilde{\Sigma}(\mathbf{k}_1)$:
\[ I = \int \frac{d\omega_1}{2\pi}\; \frac{\sigma_1^2}{\omega_1^2 + \mu_1^2} \cdot \frac{-1}{i(\omega - \omega_1) + \mu_q}. \tag{F8} \]
The integrand has poles at $\omega_1 = \pm i\mu_1$ (from $G^K$) and $\omega_1 = \omega + i\mu_q$ (from $G^R$). Closing the contour in the upper half-plane (for $\mu_1 < 0$, the pole at $\omega_1 = i\mu_1$ lies in the upper half-plane): $I =$ ...

Physical interpretation

The self-energy $\delta\Sigma^R$ modifies the effective drift: $\mu_{\text{eff}}(\mathbf{k}) = \mu(\mathbf{k}) + \delta\Sigma^R(\mathbf{k}, 0)$. Since $\delta\Sigma^R < 0$ (Eq. F10), the nonlinear coupling enhances the effective restoring force: other modes' fluctuations, mediated by $V_3$, provide additional damping. The full (dressed) propagator satisfies the Dyson equation:
\[ G^R_{\text{full}}(\omega; \mathbf{k}) = \frac{-1}{i\omega + \mu(\mathbf{k}) + \delta\Sigma^R(\mathbf{k}, \omega)}, \]
...