arxiv: 2605.07950 · v1 · submitted 2026-05-08 · 💻 cs.LG

Recognition: no theorem link

Slowly Annealed Langevin Dynamics: Theory and Applications to Training-Free Guided Generation

Atsushi Nitanda, Dake Bu, Tanya Veeravalli, Yueming Lyu

Authors on Pith no claims yet

Pith reviewed 2026-05-11 02:49 UTC · model grok-4.3

classification 💻 cs.LG

keywords Langevin dynamicsannealed samplingguided generationdiffusion modelsconvergence guaranteestraining-free methodsKL divergence

0 comments

The pith

Slow annealing of Langevin dynamics improves tracking of evolving targets with provable convergence

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that by introducing a slowdown in the annealing process of Langevin dynamics, the sampler can more accurately follow a sequence of changing target distributions, leading to better approximation of the final target. This is shown through non-asymptotic bounds derived from a KL differential inequality that highlights how slowdown contracts the distance to intermediate targets and simplifies the path complexity. The approach is particularly useful for training-free guided generation in diffusion models, where an extension called Velocity-Aware SALD incorporates the pretrained model's marginals to correct for guidance-induced deviations. A reader would care because it provides theoretical grounding for why and how to adjust sampling speed in generative models without needing to retrain them.

Core claim

Slowly Annealed Langevin Dynamics tracks moving target distributions more effectively by time slowdown, establishing non-asymptotic convergence guarantees via a KL differential inequality; Velocity-Aware SALD applies this to correct additional bias from guidance in pretrained score-based generative models, yielding a principled training-free framework with convergence analysis.

What carries the argument

The slowdown mechanism in SALD, which contracts intermediate targets and reduces the complexity of the annealing path to improve tracking accuracy.

If this is right

Non-asymptotic convergence guarantees are provided for the slowed dynamics through the KL inequality.
Slowdown improves tracking by contracting intermediate targets and lowering path complexity.
VA-SALD corrects guidance deviations using the underlying marginal distributions of the pretrained model.
The framework applies to diffusion-based and related generative models with clarified roles for functional inequalities and bias.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The theory could inspire adaptive slowdown schedules based on estimated path complexity in other sampling algorithms.
It suggests that similar slowing techniques might enhance performance in non-Langevin stochastic processes for distribution tracking.
Practical implementations may allow for more efficient conditional sampling in large-scale generative tasks without additional model training.

Load-bearing premise

The KL differential inequality and contraction properties require the score functions and guidance terms to meet certain smoothness and Lipschitz conditions.

What would settle it

Observing no reduction in the KL divergence or no improvement in generated sample quality when increasing the slowdown factor on a guided diffusion task would challenge the central claim.

Figures

Figures reproduced from arXiv: 2605.07950 by Atsushi Nitanda, Dake Bu, Tanya Veeravalli, Yueming Lyu.

**Figure 2.** Figure 2: Guided generation on "lion" with guidance of black-box reward functions based on neural Aesthetic [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Reverse-indexed VP marginal family for the unguided two-Gaussian experiment. The forward VP diffusion contracts the mixture means toward the origin, so the reverse initialization is close to a centered Gaussian while the terminal law remains bimodal. 11.1.1 Two-moons guided two-Gaussian VP diffusion. The first guided task starts from a two-component Gaussian data distribution and uses a two-moons guide. Th… view at source ↗

**Figure 4.** Figure 4: Unguided SALD overview. The KL trajectory and terminal KL confirm that increasing the slow-down budget reduces the mismatch to the terminal two-Gaussian target until the finite-particle and discretization floor dominates. DOIT [35] adaptation and budget matching. We adapt DOIT [35] (Opensource at DOIT) as a training-free local approximation to the Doob h-transform. For a base reverse transition density ϕθ(… view at source ↗

**Figure 5.** Figure 5: Unguided SALD terminal samples. Larger budgets produce visibly sharper recovery of the two target modes. The corresponding local Doob direction is gbDoob = X M m=1 wm p zm ηDOITβ(T − tk) , (85) and the adapted update is Xk+1 = Xk + ηDOIT (bbase(Xk, tk) + γβ(T − tk)gbDoob) + p ηDOITβ(T − tk) ξk. (86) Since DOIT spends M proposal evaluations per particle and step, a fair particle-step comparison uses approxi… view at source ↗

**Figure 6.** Figure 6: SALD on the two-moons guided two-Gaussian task. SALD follows the expected budget trend: larger r improves the terminal-target KL, with the curve eventually flattening near the Monte Carlo floor. The guide penalizes a chosen set P of modes. We use the four left-half modes, i.e. those with negative first coordinate, and define fP (x) = λ X j∈P exp − ∥x − cj∥ 2 2ℓ 2 f ! , (91) with gradient ∇fP (x) = − λ ℓ 2 … view at source ↗

**Figure 7.** Figure 7: SALD samples for the two-moons guided task. Terminal samples increasingly align with the two-moons guided target as the budget increases [PITH_FULL_IMAGE:figures/full_fig_p041_7.png] view at source ↗

**Figure 8.** Figure 8: VA-SALD on the two-moons guided task. Incorporating the VP velocity field yields low terminal KL across the tested budgets and a stable guidance-objective trajectory [PITH_FULL_IMAGE:figures/full_fig_p042_8.png] view at source ↗

**Figure 9.** Figure 9: VA-SALD samples for the two-moons guided task. The sample panels show that VA-SALD reaches the guided terminal geometry at small and large budgets. 42 [PITH_FULL_IMAGE:figures/full_fig_p042_9.png] view at source ↗

**Figure 10.** Figure 10: DOIT on the two-moons guided task. DOIT uses the same VP reverse SDE base sampler and the same total proposal budget, but its local Doob correction does not close the terminal KL gap as effectively as SALD or VA-SALD. 43 [PITH_FULL_IMAGE:figures/full_fig_p043_10.png] view at source ↗

**Figure 11.** Figure 11: Eight-Gaussian guided terminal target. Blue markers denote unpenalized modes, and orange crosses denote penalized modes. Contours show the base and guided terminal densities. 44 [PITH_FULL_IMAGE:figures/full_fig_p044_11.png] view at source ↗

**Figure 12.** Figure 12: SALD on the eight-Gaussian mode-penalty task. SALD reduces the terminal KL as r increases and then approaches a plateau. 45 [PITH_FULL_IMAGE:figures/full_fig_p045_12.png] view at source ↗

**Figure 13.** Figure 13: VA-SALD on the eight-Gaussian mode-penalty task. The terminal KL is low and nearly budget-insensitive, indicating that the velocity-aware reverse dynamics already align well with this guided target. 46 [PITH_FULL_IMAGE:figures/full_fig_p046_13.png] view at source ↗

**Figure 14.** Figure 14: DOIT on the eight-Gaussian mode-penalty task. The local proposal-based Doob correction improves the guide-related statistic but leaves a larger terminal KL than SALD and VA-SALD at the same computational budgets. 47 [PITH_FULL_IMAGE:figures/full_fig_p047_14.png] view at source ↗

**Figure 15.** Figure 15: Generated Images on prompt "A glowing dragon soaring through floating islands, leaving behind a trail [PITH_FULL_IMAGE:figures/full_fig_p052_15.png] view at source ↗

**Figure 16.** Figure 16: Generated Images on prompt "In the ink painting style, a naive giant panda is sitting on the majestic [PITH_FULL_IMAGE:figures/full_fig_p053_16.png] view at source ↗

**Figure 17.** Figure 17: Generated Images on prompt "The art deco music festival poster has a fox made of polished brass in [PITH_FULL_IMAGE:figures/full_fig_p054_17.png] view at source ↗

**Figure 18.** Figure 18: Generated Images on prompt "quick doodle of a guy, medium hair with long bangs, hd detailed detailed" [PITH_FULL_IMAGE:figures/full_fig_p055_18.png] view at source ↗

**Figure 19.** Figure 19: Guided generation on "bear" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p056_19.png] view at source ↗

**Figure 20.** Figure 20: Guided generation on "bear" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p057_20.png] view at source ↗

**Figure 21.** Figure 21: Guided generation on "wolf" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p057_21.png] view at source ↗

**Figure 22.** Figure 22: Guided generation on "wolf" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p058_22.png] view at source ↗

**Figure 23.** Figure 23: Guided generation on "hippo" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p058_23.png] view at source ↗

**Figure 24.** Figure 24: Guided generation on "hippo" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p059_24.png] view at source ↗

**Figure 25.** Figure 25: Guided generation on "lion" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p059_25.png] view at source ↗

**Figure 26.** Figure 26: Guided generation on "lion" with guidance parameter [PITH_FULL_IMAGE:figures/full_fig_p060_26.png] view at source ↗

read the original abstract

We study Slowly Annealed Langevin Dynamics (SALD), a sampler for tracking a path of moving target distributions and approximating the terminal target through time slowdown. We establish non-asymptotic convergence guarantees via a KL differential inequality, showing that slowdown improves tracking through contraction of intermediate targets and the complexity of the path. Motivated by training-free guided generation with pretrained score-based generative models, we further introduce Velocity-Aware SALD (VA-SALD), which explicitly incorporates the underlying marginal distributions of the pretrained model and uses slowdown to correct the additional deviation induced by guidance. This yields a principled framework for training-free guided generation for diffusion-based and related generative model families, together with convergence guarantees that clarify the roles of intermediate functional inequalities and guidance bias. Code is available at https://github.com/anitan0925/sald.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a slowed Langevin sampler plus a velocity correction that uses pretrained marginals to handle training-free guidance, backed by a KL differential inequality, but the bounds look vulnerable to guidance-induced growth in Lipschitz constants.

read the letter

The main thing here is SALD, which slows the annealing schedule in Langevin dynamics to track a path of moving targets more reliably, and then VA-SALD, which adds an explicit correction term based on the pretrained model's marginals to offset the extra drift from guidance. They derive a non-asymptotic KL inequality showing that the slowdown contracts the path length and improves the tracking error. The code is public, which is helpful for anyone wanting to test it on diffusion models.

Referee Report

3 major / 2 minor

Summary. The paper proposes Slowly Annealed Langevin Dynamics (SALD) to track a path of moving target distributions by rescaling time to slow the dynamics, and derives non-asymptotic convergence guarantees via a KL differential inequality showing that slowdown contracts the effective path complexity and improves tracking of intermediate targets. It introduces Velocity-Aware SALD (VA-SALD) that incorporates the marginal distributions of a pretrained score-based model to offset the deviation from guidance, yielding a training-free framework for guided generation together with convergence guarantees that clarify the roles of functional inequalities and guidance bias.

Significance. If the non-asymptotic bounds hold under the stated conditions, the work supplies a principled, training-free mechanism for correcting guidance bias in diffusion models and related families, together with explicit dependence of the tracking error on the slowdown schedule. The open-source code supports reproducibility of the empirical claims.

major comments (3)

[§3 (KL differential inequality)] The KL differential inequality (abstract and §3) is stated as d/dt KL(ρ_t || π_t) ≤ -c(t) KL(ρ_t || π_t) + error(t). For slowdown to guarantee contraction, the integrated error must be controlled uniformly in the slowed time variable; the derivation must therefore bound the Lipschitz constant of the guidance drift (and of the time-dependent score) independently of the slowdown factor. The manuscript does not exhibit this uniform bound for the VA-SALD case where the guidance weight can induce time-dependent growth near t=0.
[§4 (VA-SALD definition and guarantees)] In the VA-SALD construction (§4), the velocity-aware correction term is added to the drift. The proof that this term does not inflate the Lipschitz constant of the overall vector field beyond what the slowdown schedule can compensate is missing; without an explicit estimate relating the guidance weight, the marginal velocity, and the resulting Lip constant, the claimed improvement in tracking error cannot be verified.
[Theorem 3.1 / Assumption list] The non-asymptotic guarantee is obtained by integrating the differential inequality along the slowed path. The manuscript must state the precise smoothness/Lipschitz assumptions on the pretrained score and on the guidance function that make the error term integrable; these assumptions are only implicit in the abstract and are not listed as a numbered assumption set before the main theorem.

minor comments (2)

[§2] Notation for the time-dependent target π_t and the slowed process should be introduced once in §2 and used consistently; the current alternation between “intermediate targets” and “moving targets” is occasionally ambiguous.
[§3] The statement that slowdown “contracts the complexity of the path” would benefit from a short remark clarifying whether this contraction is measured in Wasserstein distance, in total variation of the score, or solely through the integrated Lip constant appearing in the error term.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We sincerely thank the referee for the thorough review and valuable feedback on our work. The comments have helped us identify areas where the presentation and proofs can be strengthened. We address each major comment below and outline the revisions we will make to the manuscript.

read point-by-point responses

Referee: [§3 (KL differential inequality)] The KL differential inequality (abstract and §3) is stated as d/dt KL(ρ_t || π_t) ≤ -c(t) KL(ρ_t || π_t) + error(t). For slowdown to guarantee contraction, the integrated error must be controlled uniformly in the slowed time variable; the derivation must therefore bound the Lipschitz constant of the guidance drift (and of the time-dependent score) independently of the slowdown factor. The manuscript does not exhibit this uniform bound for the VA-SALD case where the guidance weight can induce time-dependent growth near t=0.

Authors: We agree with the referee that a uniform bound on the Lipschitz constants, independent of the slowdown factor, is essential to ensure the error term integrates properly in the slowed time. While the original derivation relied on the regularity of the score and guidance to control growth, we did not explicitly demonstrate the uniformity for VA-SALD near t=0. In the revised manuscript, we will introduce an additional lemma establishing that the Lipschitz constant of the combined drift (including the velocity-aware correction) is bounded uniformly with respect to the slowdown parameter. This bound will leverage the properties of the pretrained marginals and ensure the contraction holds as claimed. revision: yes
Referee: [§4 (VA-SALD definition and guarantees)] In the VA-SALD construction (§4), the velocity-aware correction term is added to the drift. The proof that this term does not inflate the Lipschitz constant of the overall vector field beyond what the slowdown schedule can compensate is missing; without an explicit estimate relating the guidance weight, the marginal velocity, and the resulting Lip constant, the claimed improvement in tracking error cannot be verified.

Authors: Thank you for highlighting this gap. The original manuscript did not provide the explicit estimate for how the velocity-aware correction affects the overall Lipschitz constant. We will add this estimate in the revised version, showing that the Lipschitz constant of the correction term is proportional to the guidance weight and the supremum norm of the marginal velocity. We will then demonstrate that the slowdown schedule can be selected to ensure the contraction dominates any such inflation, thereby verifying the improvement in tracking error. revision: yes
Referee: [Theorem 3.1 / Assumption list] The non-asymptotic guarantee is obtained by integrating the differential inequality along the slowed path. The manuscript must state the precise smoothness/Lipschitz assumptions on the pretrained score and on the guidance function that make the error term integrable; these assumptions are only implicit in the abstract and are not listed as a numbered assumption set before the main theorem.

Authors: We acknowledge that the assumptions were presented implicitly rather than as an explicit numbered list. In the revised manuscript, we will add a dedicated subsection or paragraph immediately preceding Theorem 3.1 that lists the precise assumptions on the smoothness and Lipschitz continuity of the pretrained score function and the guidance function. This will clarify the conditions under which the error term is integrable and the non-asymptotic bound holds. revision: yes

Circularity Check

0 steps flagged

No significant circularity: KL differential inequality derivation is analytically self-contained

full rationale

The paper's core claim derives non-asymptotic convergence for SALD and VA-SALD from a KL differential inequality d/dt KL(ρ_t || π_t) ≤ -c(t) KL + error, with slowdown rescaling time to contract the path. This is a standard functional-inequality argument under smoothness/Lipschitz assumptions on scores and guidance; the contraction and error bounds are obtained from the dynamics rather than by redefining targets in terms of the inequality itself or by fitting parameters to the same data. No self-citation chains, ansatzes smuggled via prior work, or renaming of known results appear as load-bearing steps. The VA-SALD correction for guidance bias is explicitly constructed from pretrained marginals and does not reduce the guarantee to a tautology. The derivation therefore remains independent of its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; full manuscript would be required to audit the KL inequality assumptions and any Lipschitz or smoothness conditions.

pith-pipeline@v0.9.0 · 5444 in / 1078 out tokens · 30501 ms · 2026-05-11T02:49:29.175899+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 5 internal anchors

[1]

Springer, 2005

Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré.Gradient flows: in metric spaces and in the space of probability measures. Springer, 2005

work page 2005
[2]

Dimension-free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

Lorenzo Baldassari, Josselin Garnier, Knut Solna, and Maarten V de Hoop. Dimension-free multimodal sampling via preconditioned annealed langevin dynamics.arXiv preprint arXiv:2602.01449, 2026

work page arXiv 2026
[3]

Oxford University Press, 2013

Stéphane Boucheron, Gábor Lugosi, and Pascal Massart.Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013

work page 2013
[4]

Classifier-free guidance is a predictor-corrector.Transactions on Machine Learning Research, 2025

Arwen Bradley and Preetum Nakkiran. Classifier-free guidance is a predictor-corrector.Transactions on Machine Learning Research, 2025

work page 2025
[5]

Diffusion annealed langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

Patrick Cattiaux, Paula Cordero-Encinar, and Arnaud Guillin. Diffusion annealed langevin dynamics: a theoretical study.arXiv preprint arXiv:2511.10406, 2025

work page arXiv 2025
[6]

Inference-time alignment for diffusion models via variationally stable doob’s matching.arXiv preprint arXiv:2601.06514, 2026

Jinyuan Chang, Chenguang Duan, Yuling Jiao, Yi Xu, and Jerry Zhijian Yang. Inference-time alignment for diffusion models via variationally stable doob’s matching.arXiv preprint arXiv:2601.06514, 2026. 10

work page arXiv 2026
[7]

Provable convergence and limitations of geometric tempering for langevin dynamics

Omar Chehab, Anna Korba, Austin Stromme, and Adrien Vacher. Provable convergence and limitations of geometric tempering for langevin dynamics. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[8]

An overview of diffusion models: Ap- plications, guided generation, statistical rates and optimization.arXiv preprint arXiv:2404.07771,

Minshuo Chen, Song Mei, Jianqing Fan, and Mengdi Wang. An overview of diffusion models: Applications, guided generation, statistical rates and optimization.arXiv preprint arXiv:2404.07771, 2024

work page arXiv 2024
[9]

Non-asymptotic analysis of diffusion annealed langevin monte carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

Paula Cordero-Encinar, O Deniz Akyildiz, and Andrew B Duncan. Non-asymptotic analysis of diffusion annealed langevin monte carlo for generative modelling.arXiv preprint arXiv:2502.09306, 2025

work page arXiv 2025
[10]

Scaling rectified flow transformers for high-resolution image synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. InForty-first international conference on machine learning, 2024

work page 2024
[11]

arXiv preprint arXiv:1812.00793 , year=

Rong Ge, Holden Lee, and Andrej Risteski. Simulated tempering langevin monte carlo ii: An improved proof using soft markov chain decomposition.arXiv preprint arXiv:1812.00793, 2018

work page arXiv 2018
[12]

Mean Flows for One-step Generative Modeling

Zhengyang Geng, Mingyang Deng, Xingjian Bai, J Zico Kolter, and Kaiming He. Mean flows for one-step generative modeling.arXiv preprint arXiv:2505.13447, 2025

work page internal anchor Pith review arXiv 2025
[13]

Provable benefit of annealed langevin monte carlo for non-log-concave sampling

Wei Guo, Molei Tao, and Yongxin Chen. Provable benefit of annealed langevin monte carlo for non-log-concave sampling. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[14]

Training-free guidance beyond differentiability: Scalable path steering with tree search in diffusion and flow models.arXiv preprint arXiv:2502.11420, 2025

Yingqing Guo, Yukang Yang, Hui Yuan, and Mengdi Wang. Training-free guidance beyond differentiability: Scalable path steering with tree search in diffusion and flow models.arXiv preprint arXiv:2502.11420, 2025

work page arXiv 2025
[15]

Denoising diffusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

work page 2020
[16]

Classifier-Free Diffusion Guidance

Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance.arXiv preprint arXiv:2207.12598, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[17]

arXiv preprint arXiv:2512.04985 , year =

Yuchen Jiao, Yuxin Chen, and Gen Li. Towards a unified framework for guided diffusion models.arXiv preprint arXiv:2512.04985, 2025

work page arXiv 2025
[18]

Direct distributional optimization for provablealignmentofdiffusionmodels

Ryotaro Kawata, Kazusato Oko, Atsushi Nitanda, and Taiji Suzuki. Direct distributional optimization for provablealignmentofdiffusionmodels. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[19]

Test-time alignment of diffusion models without reward over-optimization

Sunwoo Kim, Minkyu Kim, and Dongmin Park. Test-time alignment of diffusion models without reward over-optimization. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[20]

Convergence for score-based generative modeling with polynomial complexity.Advances in Neural Information Processing Systems 35, 35:22870–22882, 2022

Holden Lee, Jianfeng Lu, and Yixin Tan. Convergence for score-based generative modeling with polynomial complexity.Advances in Neural Information Processing Systems 35, 35:22870–22882, 2022

work page 2022
[21]

Dynamic search for inference-time alignment in diffusion models, 2025

Xiner Li, Masatoshi Uehara, Xingyu Su, Gabriele Scalia, Tommaso Biancalani, Aviv Regev, Sergey Levine, and Shuiwang Ji. Dynamic search for inference-time alignment in diffusion models.arXiv preprint arXiv:2503.02039, 2025

work page arXiv 2025
[22]

Flow Matching for Generative Modeling

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling.arXiv preprint arXiv:2210.02747, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[23]

Alignment of diffusion models: Fundamentals, challenges, and future.ACM Computing Surveys, 58(9):1–37, 2026

Buhua Liu, Shitong Shao, Bao Li, Lichen Bai, Zhiqiang Xu, Haoyi Xiong, James T Kwok, Sumi Helal, and Zeke Xie. Alignment of diffusion models: Fundamentals, challenges, and future.ACM Computing Surveys, 58(9):1–37, 2026

work page 2026
[24]

Flow-GRPO: Training Flow Matching Models via Online RL

Jie Liu, Gongye Liu, Jiajun Liang, Yangguang Li, Jiaheng Liu, Xintao Wang, Pengfei Wan, Di Zhang, and Wanli Ouyang. Flow-grpo: Training flow matching models via online rl.arXiv preprint arXiv:2505.05470, 2025

work page internal anchor Pith review arXiv 2025
[25]

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow.arXiv preprint arXiv:2209.03003, 2022. 11

work page internal anchor Pith review Pith/arXiv arXiv 2022
[26]

Covariance-adaptive sequential black-box optimization for diffusion targeted generation.arXiv preprint arXiv:2406.00812, 2024

Yueming Lyu, Kim Yong Tan, Yew Soon Ong, and Ivor W Tsang. Covariance-adaptive sequential black-box optimization for diffusion targeted generation.arXiv preprint arXiv:2406.00812, 2024

work page arXiv 2024
[27]

Black-box optimizer with implicit natural gradient.arXiv preprint arXiv:1910.04301, 2019

Yueming Lyu and Ivor W Tsang. Black-box optimizer with implicit natural gradient.arXiv preprint arXiv:1910.04301, 2019

work page arXiv 1910
[28]

Random gradient-free minimization of convex functions.Foundations of Computational Mathematics, 17(2):527–566, 2017

Yurii Nesterov and Vladimir Spokoiny. Random gradient-free minimization of convex functions.Foundations of Computational Mathematics, 17(2):527–566, 2017

work page 2017
[29]

Generative modeling by estimating gradients of the data distribution.Advances in neural information processing systems, 32, 2019

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution.Advances in neural information processing systems, 32, 2019

work page 2019
[30]

Score- based generative modeling through stochastic differential equations

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score- based generative modeling through stochastic differential equations. InThe Ninth International Conference on Learning Representations, 2021

work page 2021
[31]

A stochastic analysis approach to conditional diffusion guidance.Columbia University Preprint, 2024

Wenpin Tang and Renyuan Xu. A stochastic analysis approach to conditional diffusion guidance.Columbia University Preprint, 2024

work page 2024
[32]

Evolvable conditional diffusion.arXiv preprint arXiv:2506.13834, 2025

Zhao Wei, Chin Chun Ooi, Abhishek Gupta, Jian Cheng Wong, Pao-Hsiung Chiu, Sheares Xue Wen Toh, and Yew-Soon Ong. Evolvable conditional diffusion.arXiv preprint arXiv:2506.13834, 2025

work page arXiv 2025
[33]

Posterior sampling by combining diffusion models with annealed langevin dynamics.Advances in Neural Information Processing Systems 38, 2025

Zhiyang Xun, Shivam Gupta, and Eric Price. Posterior sampling by combining diffusion models with annealed langevin dynamics.Advances in Neural Information Processing Systems 38, 2025

work page 2025
[34]

TFG: Unified training-free guidance for diffusion models

Haotian Ye, Haowei Lin, Jiaqi Han, Minkai Xu, Sheng Liu, Yitao Liang, Jianzhu Ma, James Zou, and Stefano Ermon. TFG: Unified training-free guidance for diffusion models. InThe Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024

work page 2024
[35]

Training-free adaptation of diffusion models via doob’sh-transform.arXiv preprint arXiv:2602.16198, 2026

Qijie Zhu, Zeqi Ye, Han Liu, Zhaoran Wang, and Minshuo Chen. Training-free adaptation of diffusion models via doob’sh-transform.arXiv preprint arXiv:2602.16198, 2026. 12 Table of Contents 1 Introduction 1 2 Problem Setup and Methods 2 2.1 Slowly Annealed Langevin Dynamics [13] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2.2 Velocity-...

work page arXiv 2026
[36]

tackles the problem of supervised fine-tuning with endogenous conditioning, in which they learn theh-function through a martingale-covariation loss. In the alignment for diffusion models in the Direct Preference Optimization (DPO) setting, a recent work [18] formulates a distributional optimization problem and subsequently enable sampling from the learned...

work page
[37]

There exist constantLπ,space >0such that for allx, y∈R d,t∈[0, T] ∥∇logπ t(x)− ∇logπ t(y)∥ ≤L π,space∥x−y∥

work page
[38]

There exist a measurable functionM and constants Lπ,time > 0such that for every s < s ′, (s, s′ ∈ [0, S]), and everyx∈R d, ∥∇logπ t(s′)(x)− ∇logπ t(s)(x)∥ ≤L π,time(s′ −s)(1 +M(x))

work page
[39]

There existsα′ 0 >0such that Eα′ 0(πt,∇logπ t)<+∞,E α′ 0(πt,1 +M)<+∞,∀t∈[0, T]

work page
[40]

We assumeη2L2 π,space < 1 8. Then, for everyα′ ∈(0, α ′ 0]and everys∈[s k, sk+1], − Z ˆρs⟨δπt(s)(x), As⟩dx≤ 1 4FI(ˆρs∥˜πs) + 2η2α′−1Γ(t(s))KL(ˆρs∥˜πs) + 2η∆(t(s)).(21) Here, Γ(t) := 2L2 π,time + 64L2 π,space ˙s(t)−2 + 1 +η 2L2 π,time , ∆(t) := 64ηL2 π,spaceEα′(πt,∇logπ t) + 2ηL2 π,time(1 + 32η2L2 π,space)Eα′(πt,1 +M) + 16dL 2 π,space. We here present the ...

work page
[41]

There exist constantsLc,space, Lπ,space >0such that for allx, y∈R d,t∈[0, T] ∥ct(x)−c t(y)∥ ≤L c,space∥x−y∥,(45) ∥∇logπ t(x)− ∇logπ t(y)∥ ≤L π,space∥x−y∥.(46)

work page
[42]

There exist a measurable functionM and constants Lc,time, Lπ,time > 0such that for every s < s ′, (s, s′ ∈ [0, S]), and everyx∈R d, ∥ct(s′)(x)−c t(s)(x)∥ ≤L c,time(s′ −s)(1 +M(x)),(47) ∥∇logπ t(s′)(x)− ∇logπ t(s)(x)∥ ≤L π,time(s′ −s)(1 +M(x)).(48)

work page
[43]

There existsα′ 0 >0such that Eα′ 0(πt, ct)<+∞,E α′ 0(πt,∇logπ t)<+∞,E α′ 0(πt,1 +M)<+∞,∀t∈[0, T]

work page
[44]

The art deco music festival poster has a fox made of polished brass in the center of the picture, with smooth lines and elegant posture

We assume 4η2 ˙t(s)Lc,space + ση(t(s))2 2 Lπ,space 2 < 1 2 . Then, for everyα′ ∈(0, α ′ 0]and everys∈[s k, sk+1], − Z ˆρs⟨δVA πt(s)(x), As⟩dx≤ ση(t(s))2 8 FI(ˆρs∥˜πs) + 2η2α′−1Γ(t(s))KL(ˆρs∥˜πs) + 2η∆(t(s)). (49) Here Γ(t) :=σ η(t)−2 ( 4 ˙s(t)−2L2 c,time +σ η(t)4L2 π,time + 8 4 ˙s(t)−2L2 c,space +σ η(t)4L2 π,space 4 ˙s(t)−2 +σ η(t)4 +η 2 4 ˙s(t)−2L2 c,tim...

work page