pith. machine review for the scientific record.

arxiv: 2209.15571 · v3 · submitted 2022-09-30 · 💻 cs.LG · stat.ML

Recognition: 2 theorem links · Lean Theorem

Building Normalizing Flows with Stochastic Interpolants

Michael S. Albergo, Eric Vanden-Eijnden

Pith reviewed 2026-05-12 06:48 UTC · model grok-4.3

classification: 💻 cs.LG · stat.ML
keywords: normalizing flows · generative models · stochastic interpolants · probability current · velocity field · density estimation · optimal transport · image generation

The pith

A new method learns velocity fields for continuous normalizing flows directly from the probability currents of interpolating densities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes building continuous-time normalizing flows that transport mass between arbitrary base and target densities. The velocity field is taken from the probability current of a time-dependent density that interpolates between them in finite time. Training uses a quadratic loss on this velocity, expressed as expectations that can be estimated from samples, rather than maximum likelihood, which requires backpropagation through ODE solvers. The result is a flow usable for sampling in either direction and for likelihood evaluation along the path. The same framework can be tuned to shorten the transport path; it connects to diffusion models but can bypass them entirely in favor of simpler ODE dynamics.
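To make the training signal concrete, here is a minimal PyTorch sketch of the interpolant-based quadratic loss. It is a sketch under stated assumptions: `VelocityNet` and `sample_target` are illustrative placeholders rather than the authors' code, and the trigonometric schedule is one admissible choice of interpolant.

```python
# Minimal sketch, assuming a trigonometric interpolant and toy 2-D data.
# VelocityNet and sample_target are illustrative placeholders, not the
# authors' code; the loss is the quadratic objective described above.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Small MLP approximating v(t, x); the paper trains far larger nets."""
    def __init__(self, dim=2, width=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, width), nn.SiLU(),
            nn.Linear(width, width), nn.SiLU(),
            nn.Linear(width, dim),
        )

    def forward(self, t, x):
        return self.net(torch.cat([t, x], dim=-1))

def interpolant(t, x0, x1):
    """x_t = cos(pi t / 2) x0 + sin(pi t / 2) x1 and its time derivative,
    one admissible bridge between base and target samples."""
    a = 0.5 * torch.pi * t
    xt = torch.cos(a) * x0 + torch.sin(a) * x1
    dxt = 0.5 * torch.pi * (-torch.sin(a) * x0 + torch.cos(a) * x1)
    return xt, dxt

v = VelocityNet()
opt = torch.optim.Adam(v.parameters(), lr=1e-3)

for step in range(1000):
    x0 = torch.randn(256, 2)          # base: standard Gaussian
    x1 = sample_target(256)           # hypothetical target sampler
    t = torch.rand(256, 1)
    xt, dxt = interpolant(t, x0, x1)
    vt = v(t, xt)
    # Quadratic loss |v|^2 - 2 v . dx_t/dt: expectations over samples only,
    # no backpropagation through an ODE solver.
    loss = (vt.pow(2).sum(-1) - 2.0 * (vt * dxt).sum(-1)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Each gradient step touches only sampled pairs and a uniform time, which is the claimed cost advantage over maximum-likelihood CNF training.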

Core claim

The velocity field of the normalizing flow is obtained from the probability current of any chosen interpolating density between base and target; the resulting objective is a quadratic loss on the velocity that is directly amenable to empirical estimation from samples, enabling efficient learning without backpropagating through ODE solvers.
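In symbols (notation assumed from the description above, not copied from the manuscript): with x_t = I(t, x_0, x_1) the stochastic interpolant, x_0 drawn from the base and x_1 from the target, the objective and its minimizer read

```latex
G(\hat v) = \int_0^1 \mathbb{E}\Big[\, \big|\hat v(t, x_t)\big|^2
        - 2\, \partial_t I(t, x_0, x_1) \cdot \hat v(t, x_t) \Big]\, dt,
\qquad
\hat v^{\star}(t, x) = \mathbb{E}\big[\, \partial_t I(t, x_0, x_1) \;\big|\; x_t = x \,\big]
        = \frac{J_t(x)}{p_t(x)}.
```

Every term is an expectation over sample pairs and a uniform time, so stochastic gradients require no ODE solve.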

What carries the argument

The stochastic interpolant, defined as a time-dependent density bridging base and target in finite time, whose probability current supplies the velocity field for the flow.
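One admissible instance, offered as an assumption-labeled example rather than the paper's unique construction:

```latex
x_t = I(t, x_0, x_1) = \cos\big(\tfrac{\pi t}{2}\big)\, x_0
        + \sin\big(\tfrac{\pi t}{2}\big)\, x_1,
\qquad I(0, x_0, x_1) = x_0, \quad I(1, x_0, x_1) = x_1,
```

so that p_t, the law of x_t, bridges base and target, and its probability current J_t supplies the flow's velocity field.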

Load-bearing premise

There exists an interpolating density whose probability current can be approximated well enough by a neural network to produce a flow that transports mass accurately without high-dimensional instabilities or biases.

What would settle it

Train the network on the quadratic loss using samples from base and target; if the resulting ODE flow does not map held-out base samples to the target distribution with high fidelity on a benchmark such as CIFAR-10, the central claim fails.
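A minimal sketch of that test, reusing the `VelocityNet` model `v` and the hypothetical `sample_target` from the earlier snippet; the two-moment comparison is a toy stand-in for the FID evaluation a CIFAR-10 benchmark would use.

```python
# Push held-out base samples through the learned ODE and compare against
# target samples. Heun integration and the moment check are stand-ins for
# an adaptive solver and a full FID computation.
import torch

@torch.no_grad()
def flow_samples(v, x0, n_steps=100):
    """Integrate dx/dt = v(t, x) from t = 0 to t = 1 with Heun's method."""
    x, dt = x0.clone(), 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0], 1), i * dt)
        k1 = v(t, x)
        k2 = v(t + dt, x + dt * k1)
        x = x + 0.5 * dt * (k1 + k2)
    return x

x_gen = flow_samples(v, torch.randn(10_000, 2))   # base through the ODE
x_ref = sample_target(10_000)                     # held-out target samples
print("mean gap:", (x_gen.mean(0) - x_ref.mean(0)).norm().item())
print("std gap: ", (x_gen.std(0) - x_ref.std(0)).norm().item())
```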

Original abstract

A generative model based on a continuous-time normalizing flow between any pair of base and target probability densities is proposed. The velocity field of this flow is inferred from the probability current of a time-dependent density that interpolates between the base and the target in finite time. Unlike conventional normalizing flow inference methods based on the maximum likelihood principle, which require costly backpropagation through ODE solvers, our interpolant approach leads to a simple quadratic loss for the velocity itself which is expressed in terms of expectations that are readily amenable to empirical estimation. The flow can be used to generate samples from either the base or target, and to estimate the likelihood at any time along the interpolant. In addition, the flow can be optimized to minimize the path length of the interpolant density, thereby paving the way for building optimal transport maps. In situations where the base is a Gaussian density, we show that the velocity of our normalizing flow can also be used to construct a diffusion model to sample the target as well as estimate its score. However, our approach shows that we can bypass this diffusion completely and work at the level of the probability flow with greater simplicity, opening an avenue for methods based solely on ordinary differential equations as an alternative to those based on stochastic differential equations. Benchmarking on density estimation tasks illustrates that the learned flow can match and surpass conventional continuous flows at a fraction of the cost, and compares well with diffusions on image generation on CIFAR-10 and ImageNet $32\times32$. The method scales ab-initio ODE flows to previously unreachable image resolutions, demonstrated up to $128\times128$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a generative modeling framework that constructs continuous-time normalizing flows between arbitrary base and target densities by inferring the velocity field from the probability current of a stochastic interpolant density p_t. This yields a quadratic regression loss on the velocity that is directly estimable from samples, bypassing the need for backpropagation through ODE solvers required in standard maximum-likelihood CNF training. The approach also enables path-length minimization for approximate optimal transport, construction of diffusion models when the base is Gaussian, and likelihood estimation along the trajectory; empirical results are shown on density estimation and image generation up to 128x128 resolution.

Significance. If the central construction holds, the method supplies a computationally lighter alternative to both continuous normalizing flows and score-based diffusion models while retaining the ability to generate samples and evaluate likelihoods. Explicit credit is due for the empirical demonstration that the learned ODE flows match or exceed conventional CNF performance at substantially lower cost and scale to ImageNet 32x32 and 128x128 resolutions, as well as for the explicit link to optimal-transport path optimization.

major comments (2)
  1. [§3.2, Eq. (7)–(9)] The derivation that the quadratic loss recovers an unbiased estimator of the marginal probability current J_t relies on the conditional expectation E[ dX/dt | X_t ] equaling the velocity of the interpolant without residual bias from the stochastic term γ(t)dW. The manuscript does not explicitly verify that integrating the learned ODE (rather than the full SDE) reproduces the marginal p_t exactly when γ(t) > 0; a short proof or counter-example would be needed to confirm the claim that the ODE alone transports mass correctly.
  2. [§4.1] The path-length objective: minimizing the expected path length of the interpolant is presented as a route to optimal transport maps, yet it is not shown that the resulting velocity remains consistent with the original quadratic loss or that the continuity equation is still satisfied after this additional optimization. If the two objectives conflict, the transport claim would require further justification.
minor comments (2)
  1. Notation for the interpolant parameters α(t), β(t), γ(t) is introduced without a consolidated table; a single reference table would improve readability.
  2. The CIFAR-10 and ImageNet experiments report FID and NLL but omit the number of function evaluations and wall-clock training time relative to the baselines, weakening the “fraction of the cost” claim.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the significance of our work and for the detailed and constructive major comments. We address each point below with clarifications and commit to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: [§3.2, Eq. (7)–(9)] The derivation that the quadratic loss recovers an unbiased estimator of the marginal probability current J_t relies on the conditional expectation E[ dX/dt | X_t ] equaling the velocity of the interpolant without residual bias from the stochastic term γ(t)dW. The manuscript does not explicitly verify that integrating the learned ODE (rather than the full SDE) reproduces the marginal p_t exactly when γ(t) > 0; a short proof or counter-example would be needed to confirm the claim that the ODE alone transports mass correctly.

    Authors: We agree that an explicit verification would improve clarity. The velocity v_t is defined directly from the marginal probability current J_t of the interpolant density p_t via v_t = J_t / p_t, so that the continuity equation ∂_t p_t + ∇·(v_t p_t) = 0 holds by construction. Consequently, the ODE dx/dt = v_t(x,t) transports the marginals exactly, regardless of whether samples from p_t are generated by an underlying SDE. The quadratic loss regresses to this v_t; the conditional expectation E[dX/dt | X_t] recovers the required drift because the stochastic increment γ(t)dW has zero conditional mean. We will add a short appendix deriving the equivalence between the learned ODE and the marginal evolution (including the relation to the Fokker-Planck operator of the interpolant SDE) to confirm there is no residual bias; a sketch of that rewriting appears after these responses. revision: yes

  2. Referee: [§4.1] The path-length objective: minimizing the expected path length of the interpolant is presented as a route to optimal transport maps, yet it is not shown that the resulting velocity remains consistent with the original quadratic loss or that the continuity equation is still satisfied after this additional optimization. If the two objectives conflict, the transport claim would require further justification.

    Authors: The path-length objective is applied to the choice of interpolant (specifically, the schedule γ(t) and the form of the stochastic bridge), not as an additive penalty on the velocity itself. For any fixed interpolant the quadratic loss is minimized first, guaranteeing that the learned velocity satisfies the continuity equation for that interpolant’s marginal p_t. The path-length term then selects, among possible interpolants, the one whose induced flow is closer to an optimal-transport map. Because the velocity training procedure and the continuity equation remain unchanged, there is no conflict. We will insert a clarifying paragraph in §4.1 that separates the two stages of optimization and states that transport properties hold for whichever interpolant is chosen; the Benamou-Brenier reading of this two-stage scheme is sketched below. revision: yes
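Two sketches that support the responses above, with notation assumed rather than quoted from the manuscript. For the first response, the standard Fokker-Planck rewriting shows why an ODE can carry the SDE's marginals: if dX_t = b_t(X_t) dt + γ(t) dW_t has marginal p_t, then

```latex
\partial_t p_t = -\nabla \cdot (b_t\, p_t) + \tfrac{1}{2} \gamma^2(t)\, \Delta p_t
        = -\nabla \cdot \Big( \big( b_t - \tfrac{1}{2} \gamma^2(t)\, \nabla \log p_t \big)\, p_t \Big),
```

so the probability-flow velocity v_t = b_t − ½γ²(t)∇log p_t satisfies the continuity equation exactly, and the ODE dx/dt = v_t(x) reproduces every marginal p_t; the increment γ(t)dW_t drops out because it has zero conditional mean. For the second response, the two-stage scheme can be read in Benamou-Brenier form: with the velocity fitted by the quadratic loss for each fixed interpolant I, the outer problem selects the bridge of least kinetic energy,

```latex
\min_{I} \int_0^1 \mathbb{E}\big[\, |v^{I}_t(x_t)|^2 \,\big]\, dt,
\qquad x_t \sim p^{I}_t,
```

whose infimum over all admissible bridges is the squared 2-Wasserstein distance; shrinking the path length therefore moves the learned flow toward an optimal transport map without touching the inner regression.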

Circularity Check

0 steps flagged

Derivation self-contained: quadratic loss follows directly from probability current of user-chosen interpolant

Full rationale

The core construction defines the target velocity as the conditional expectation of the interpolant velocity given the marginal density at time t, then obtains the regression loss as the expectation of the squared difference between the network output and that velocity. This is an identity from the definition of the probability current and the continuity equation; it does not reduce any fitted parameter or prediction back to itself. The interpolant schedule is an external modeling choice supplied by the practitioner, not inferred from the same data that the flow is later evaluated on. No self-citation is invoked to justify uniqueness of the velocity or to close a definitional loop. The avoidance of ODE back-propagation is a direct algebraic consequence of the quadratic form, not a tautology.
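The uniqueness point can be made explicit (a standard observation, notation assumed): the continuity equation pins down the velocity only up to fields that move no mass,

```latex
\nabla \cdot \big( (v_t + w_t)\, p_t \big) = \nabla \cdot (v_t\, p_t)
\qquad \text{whenever} \qquad \nabla \cdot (w_t\, p_t) = 0,
```

and the quadratic loss, as an L^2 regression, singles out the specific representative v_t = J_t / p_t, the conditional expectation of the interpolant velocity.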

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the existence of a user-specified interpolating density whose probability current yields a velocity field that can be learned from samples; the specific schedule of the interpolant is a free design choice.

free parameters (1)
  • interpolant schedule
    The explicit time-dependent form of the interpolating density (e.g., linear or other mixing schedule) is chosen by the user and affects the resulting velocity field.
axioms (1)
  • domain assumption: A velocity field exists that transports probability mass exactly according to the current of the chosen interpolant.
    Invoked when equating the learned velocity to the probability current of the interpolant density.

pith-pipeline@v0.9.0 · 5584 in / 1528 out tokens · 72426 ms · 2026-05-12T06:48:07.610362+00:00 · methodology


Forward citations

Cited by 37 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching

    cs.LG 2026-05 unverdicted novelty 8.0

    Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.

  2. Generative Modeling with Flux Matching

    cs.LG 2026-05 unverdicted novelty 8.0

    Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices be...

  3. ReConText3D: Replay-based Continual Text-to-3D Generation

    cs.CV 2026-04 conditional novelty 8.0

    ReConText3D is the first replay-memory framework for continual text-to-3D generation that prevents catastrophic forgetting on new textual categories while preserving quality on previously seen classes.

  4. Sampling from Flow Language Models via Marginal-Conditioned Bridges

    cs.LG 2026-05 unverdicted novelty 7.0

    Marginal-conditioned bridges enable training-free sampling from Flow Language Models by drawing clean one-hot endpoints from factorized posteriors and using Ornstein-Uhlenbeck bridges, preserving token marginals and r...

  5. One-Step Generative Modeling via Wasserstein Gradient Flows

    cs.LG 2026-05 conditional novelty 7.0

    W-Flow achieves state-of-the-art one-step ImageNet 256x256 generation at 1.29 FID by training a static neural network to follow a Wasserstein gradient flow that minimizes Sinkhorn divergence, delivering roughly 100x f...

  6. HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation

    cs.CV 2026-05 conditional novelty 7.0

    HorizonDrive enables stable long-horizon autoregressive driving simulation via anti-drifting teacher training with scheduled rollout recovery and teacher rollout distillation.

  7. Zero-couplings of infinite measures with cyclically monotone support and multivariate regular variation

    math.PR 2026-05 unverdicted novelty 7.0

    Existence and uniqueness of cyclically monotone zero-couplings are established for arbitrary pairs of infinite measures in M_0(R^d) under a Hausdorff-dimension condition, with the tail limit of such couplings for regu...

  8. SDFlow: Similarity-Driven Flow Matching for Time Series Generation

    cs.AI 2026-05 unverdicted novelty 7.0

    SDFlow uses similarity-driven flow matching with low-rank manifold decomposition and a categorical posterior to generate high-fidelity long time series in VQ space without step-wise error accumulation.

  9. Flow Matching on Symmetric Spaces

    cs.LG 2026-05 unverdicted novelty 7.0

    A general framework reduces flow matching on symmetric spaces to flow matching on a Lie algebra subspace, linearizing geodesics.

  10. Generative Modeling with Orbit-Space Particle Flow Matching

    cs.GR 2026-05 unverdicted novelty 7.0

    OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.

  11. Towards Efficient and Expressive Offline RL via Flow-Anchored Noise-conditioned Q-Learning

    cs.LG 2026-05 unverdicted novelty 7.0

    FAN achieves state-of-the-art offline RL performance on robotic tasks by anchoring flow policies and using single-sample noise-conditioned Q-learning, with proven convergence and reduced runtimes.

  12. Self-Improving Tabular Language Models via Iterative Group Alignment

    cs.LG 2026-04 unverdicted novelty 7.0

    TabGRAA enables self-improving tabular language models through iterative group-relative advantage alignment using modular automated quality signals like distinguishability classifiers.

  13. UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

    cs.CV 2026-04 unverdicted novelty 7.0

    UniGeo unifies geometric guidance across three levels in video models to reduce geometric drift and improve consistency in camera-controllable image editing.

  14. Physically Grounded 3D Generative Reconstruction under Hand Occlusion using Proprioception and Multi-Contact Touch

    cs.CV 2026-04 unverdicted novelty 7.0

    A conditional diffusion model using proprioception and multi-contact touch produces metric-scale, physically consistent 3D object reconstructions under hand occlusion.

  15. Intervention-Based Time Series Causal Discovery via Simulator-Generated Interventional Distributions

    cs.LG 2026-05 unverdicted novelty 6.0

    SVAR-FM uses simulator clamping to produce interventional distributions and flow matching to identify time series causal structures, with an error bound that predicts sign reversal of causal effects below a simulator ...

  16. Discrete Flow Matching: Convergence Guarantees Under Minimal Assumptions

    cs.LG 2026-05 unverdicted novelty 6.0

    Discrete flow matching on Z_m^d achieves non-asymptotic KL bounds for early-stopped targets and explicit TV convergence to the true target under an approximation error assumption, with improved scaling in dimension d ...

  17. Conservative Flows: A New Paradigm of Generative Models

    cs.LG 2026-05 unverdicted novelty 6.0

    Conservative flows generate by running probability-preserving stochastic dynamics initialized at data points rather than noise, using corrected Langevin or predictor-corrector mechanisms on top of any pretrained flow ...

  18. SDFlow: Similarity-Driven Flow Matching for Time Series Generation

    cs.AI 2026-05 unverdicted novelty 6.0

    SDFlow learns a global transport map via similarity-driven flow matching in VQ latent space, using low-rank manifold decomposition and a categorical posterior to handle discreteness, yielding SOTA long-horizon perform...

  19. Free Energy Surface Sampling via Reduced Flow Matching

    cs.LG 2026-05 unverdicted novelty 6.0

    FES-FM applies reduced flow matching with a Hessian-derived prior to directly sample free energy surfaces in collective variable space, claiming lower computational cost and higher accuracy per unit time than standard...

  20. Quantum Dynamics via Score Matching on Bohmian Trajectories

    quant-ph 2026-04 unverdicted novelty 6.0

    Neural networks learn the score of the probability density on Bohmian trajectories to recover exact Schrödinger dynamics via self-consistent minimization for nodeless wave functions, demonstrated on double-well splitt...

  21. AlloSR$^2$: Rectifying One-Step Super-Resolution to Stay Real via Allomorphic Generative Flows

    cs.CV 2026-04 unverdicted novelty 6.0

    AlloSR$^2$ rectifies one-step super-resolution trajectories with allomorphic generative flows via SNR initialization, velocity supervision, and self-adversarial matching to deliver state-of-the-art fidelity and realism.

  22. Fisher Decorator: Refining Flow Policy via a Local Transport Map

    cs.LG 2026-04 unverdicted novelty 6.0

    Fisher Decorator refines flow policies in offline RL via a local transport map and Fisher-matrix quadratic approximation of the KL constraint, yielding controllable error near the optimum and SOTA benchmark results.

  23. UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

    cs.CV 2026-04 unverdicted novelty 6.0

    UniGeo adds unified geometric guidance at three levels in video models to reduce geometric drift and improve structural fidelity in camera-controllable image editing.

  24. Frequency-Aware Flow Matching for High-Quality Image Generation

    cs.CV 2026-04 unverdicted novelty 6.0

    FreqFlow introduces frequency-aware conditioning and a two-branch architecture to flow matching, reaching FID 1.38 on ImageNet-256 and outperforming DiT and SiT.

  25. LiveMoments: Reselected Key Photo Restoration in Live Photos via Reference-guided Diffusion

    cs.CV 2026-04 unverdicted novelty 6.0

    LiveMoments restores reselected key photos in Live Photos via reference-guided diffusion and motion alignment, yielding higher perceptual quality and fidelity than prior methods especially under fast motion.

  26. Region-Constrained Group Relative Policy Optimization for Flow-Based Image Editing

    cs.CV 2026-04 unverdicted novelty 6.0

    RC-GRPO-Editing constrains GRPO exploration to editing regions via localized noise and attention rewards, improving instruction adherence and non-target preservation in flow-based image editing.

  27. Monte Carlo Event Generation with Continuous Normalizing Flows

    hep-ph 2026-04 conditional novelty 6.0

    Continuous normalizing flows improve unweighting efficiency in Monte Carlo event generation for high-jet-multiplicity collider processes by factors up to 184, with wall-time gains of about ten when combined with coupl...

  28. Mean Flows for One-step Generative Modeling

    cs.LG 2025-05 unverdicted novelty 6.0

    MeanFlow uses a derived identity between average and instantaneous velocities to train one-step flow models, achieving FID 3.43 on ImageNet 256x256 with 1-NFE from scratch.

  29. MAGI-1: Autoregressive Video Generation at Scale

    cs.CV 2025-05 unverdicted novelty 6.0

    MAGI-1 is a 24B-parameter autoregressive video world model that predicts denoised frame chunks sequentially with increasing noise to enable causal, scalable, streaming generation up to 4M token contexts.

  30. DanceGRPO: Unleashing GRPO on Visual Generation

    cs.CV 2025-05 unverdicted novelty 6.0

    DanceGRPO applies GRPO to visual generation tasks to achieve stable policy optimization across diffusion models, rectified flows, multiple tasks, and diverse reward models, outperforming prior RL methods.

  31. CaloArt: Large-Patch x-Prediction Diffusion Transformers for High-Granularity Calorimeter Shower Generation

    physics.ins-det 2026-05 unverdicted novelty 5.0

    CaloArt achieves top FPD, high-level, and classifier metrics on CaloChallenge datasets 2 and 3 while keeping single-GPU generation at 9-11 ms per shower by combining large-patch tokenization, x-prediction, and conditi...

  32. Neural Posterior Estimation of Terrain Parameters from Radar Sounder Data

    eess.SP 2026-05 unverdicted novelty 5.0

    Neural posterior estimation trained on simulated radar data enables probabilistic inference of terrain parameters from real Mars radar sounder profiles while conditioning on reference surface assumptions.

  33. SubFlow: Sub-mode Conditioned Flow Matching for Diverse One-Step Generation

    cs.LG 2026-04 unverdicted novelty 5.0

    SubFlow restores full mode coverage in one-step flow matching by conditioning on sub-modes from semantic clustering, yielding higher diversity on ImageNet-256 while preserving FID.

  34. Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning

    cs.LG 2026-04 unverdicted novelty 5.0

    Proposes mean flow policies and LeJEPA loss to overcome Gaussian policy limits and weak subgoal generation in hierarchical offline GCRL, reporting strong results on OGBench state and pixel tasks.

  35. Exploring Motion-Language Alignment for Text-driven Motion Generation

    cs.CV 2026-04 unverdicted novelty 5.0

    MLA-Gen advances text-driven motion synthesis by aligning global motion patterns with fine-grained text semantics and mitigating attention sink effects via new masking techniques.

  36. An AI-based Detector Simulation and Reconstruction Model for the ALEPH Experiment at LEP

    physics.ins-det 2026-04 unverdicted novelty 4.0

    Parnassus faithfully reproduces the ALEPH detector response at event, jet, and particle levels for clean e+e- to Z to qqbar events.

  37. Flow Matching Guide and Code

    cs.LG 2024-12 unverdicted novelty 2.0

    Flow Matching is a generative modeling framework with mathematical foundations, design choices, extensions, and open-source PyTorch code for applications like image and text generation.
