Recognition: 3 theorem links
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Pith reviewed 2026-05-10 13:12 UTC · model grok-4.3
The pith
Rectified flow learns ODEs that follow straight paths between data distributions by solving a simple least-squares problem.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By learning a velocity field that minimizes the squared deviation from straight-line interpolations between paired samples of the two distributions, the rectification process produces a deterministic coupling with provably non-increasing convex transport costs. Recursively applying rectification yields a sequence of flows with progressively straighter trajectories that can be integrated accurately without fine discretization.
What carries the argument
The rectification procedure, which refines a coupling of π₀ and π₁ into one where the learned ODE follows straight paths, reducing transport costs and enabling coarse simulation.
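The least-squares objective that carries this argument fits in a few lines. A minimal NumPy sketch, assuming paired samples and a generic velocity callable `v` standing in for the paper's trained network (the parametrization and optimizer are deliberately left out):

```python
import numpy as np

def rectified_flow_loss(v, x0, x1, t):
    """Monte Carlo estimate of the rectified flow objective
    E[ || v((1-t)*x0 + t*x1, t) - (x1 - x0) ||^2 ].

    v      -- callable (x, t) -> predicted velocity, same shape as x
    x0, x1 -- paired samples from pi_0 and pi_1, shape (n, d)
    t      -- interpolation times in [0, 1], shape (n, 1)
    """
    xt = (1.0 - t) * x0 + t * x1      # point on the straight interpolant
    target = x1 - x0                  # velocity of the straight path
    residual = v(xt, t) - target
    return float(np.mean(np.sum(residual ** 2, axis=1)))
```

The exact minimizer of this objective is the conditional expectation of X₁ − X₀ given the interpolant, which is the quantity the paper's theory analyzes; everything below concerns how far a finite-sample neural approximation can drift from it.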
If this is right
- Straight paths can be simulated exactly without discretization error.
- Recursive rectification increases path straightness for better efficiency.
- High-quality image generation and translation achievable with one Euler step.
- Unified approach for generative modeling and domain adaptation tasks.
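The single-Euler-step claim in the list above can be made concrete. A hedged NumPy sketch, again with a generic velocity callable rather than the paper's network: one step gives x₁ ≈ x₀ + v(x₀, 0), and for an exactly straight field the result does not depend on the step count at all.

```python
import numpy as np

def euler_sample(v, x0, n_steps=1):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with n_steps Euler
    steps. For a well-rectified (near-straight) flow, n_steps = 1 is
    already accurate: x1 is approximately x0 + v(x0, 0)."""
    x, dt = x0, 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((x.shape[0], 1), k * dt)
        x = x + dt * v(x, t)
    return x
```

With a constant (perfectly straight) field, `n_steps=1` and `n_steps=100` land on the same endpoint, which is the "simulated exactly without discretization error" claim in miniature.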
Where Pith is reading between the lines
- This could simplify training of other continuous normalizing flows by encouraging straight trajectories.
- Applications might extend to sequential data or non-image domains where fast sampling is critical.
- Connections to optimal transport suggest potential for lower cost solutions in distribution matching.
Load-bearing premise
The nonlinear least-squares optimization reliably finds a velocity field that closely approximates the straight paths without suffering from scalability issues or optimization failures.
What would settle it
If after one or more rectifications the learned flow paths remain significantly curved or if single-step Euler integration yields poor sample quality on image tasks, the claim of increasingly straight and efficient flows would be falsified.
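The falsification test described above can be run directly: simulate the ODE finely and compare the velocity along each trajectory with the chord between its endpoints. A minimal NumPy sketch of this straightness diagnostic (the callable `v` stands in for the trained network):

```python
import numpy as np

def straightness_gap(v, x0, n_steps=100):
    """Approximate S = integral of E||(x1 - x0) - v(x_t, t)||^2 dt along
    simulated trajectories; S = 0 exactly when every path is straight."""
    x, dt = x0, 1.0 / n_steps
    xs, vs = [x0], []
    for k in range(n_steps):
        t = np.full((x.shape[0], 1), k * dt)
        vk = v(x, t)
        vs.append(vk)
        x = x + dt * vk
        xs.append(x)
    chord = xs[-1] - xs[0]              # x1 - x0 per trajectory
    gaps = [np.mean(np.sum((vk - chord) ** 2, axis=1)) for vk in vs]
    return float(np.mean(gaps))
```

A gap near zero after rectification supports the claim; a gap that stays large after repeated rectification, or poor single-step sample quality, would falsify it.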
Original abstract
We present rectified flow, a surprisingly simple approach to learning (neural) ordinary differential equation (ODE) models to transport between two empirically observed distributions π₀ and π₁, hence providing a unified solution to generative modeling and domain transfer, among various other tasks involving distribution transport. The idea of rectified flow is to learn the ODE to follow the straight paths connecting the points drawn from π₀ and π₁ as much as possible. This is achieved by solving a straightforward nonlinear least squares optimization problem, which can be easily scaled to large models without introducing extra parameters beyond standard supervised learning. The straight paths are special and preferred because they are the shortest paths between two points, and can be simulated exactly without time discretization and hence yield computationally efficient models. We show that the procedure of learning a rectified flow from data, called rectification, turns an arbitrary coupling of π₀ and π₁ to a new deterministic coupling with provably non-increasing convex transport costs. In addition, recursively applying rectification allows us to obtain a sequence of flows with increasingly straight paths, which can be simulated accurately with coarse time discretization in the inference phase. In empirical studies, we show that rectified flow performs superbly on image generation, image-to-image translation, and domain adaptation. In particular, on image generation and translation, our method yields nearly straight flows that give high quality results even with a single Euler discretization step.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes rectified flow, a simple method to learn neural ODEs transporting between empirical distributions π₀ and π₁ by solving a nonlinear least-squares problem that encourages the velocity field to follow straight-line paths between paired samples. Rectification converts an arbitrary coupling into a deterministic one with provably non-increasing convex transport costs; recursive rectification produces successively straighter flows that can be simulated accurately with coarse (even single-step) Euler discretization. The approach unifies generative modeling and domain transfer and is shown empirically to perform well on image generation, image-to-image translation, and domain adaptation.
Significance. If the learned neural flows approximately inherit the exact-case straightness and cost-reduction properties, the method supplies a parameter-efficient, unified framework for flow-based transport that reduces inference cost via coarse discretization while maintaining competitive sample quality. The empirical results on high-dimensional vision tasks indicate practical promise, and the absence of extra architectural parameters beyond standard supervised learning is a notable engineering strength.
major comments (2)
- [§3] §3 (Rectification theorem): The non-increasing convex transport-cost guarantee and the straight-path property are derived for the exact population minimizer of E[‖v((1−t)X₀ + t X₁, t) − (X₁ − X₀)‖²]. The manuscript instead optimizes a neural-network approximator on finite samples; no quantitative bounds on optimization or approximation error are supplied to ensure the induced endpoint map still satisfies the cost inequality or remains sufficiently close to linear interpolants for the single-step Euler claim to be reliable. This gap is load-bearing for the central practical advantage.
- [§4] §4 (Empirical validation): The reported single-step generation and translation results are strong, yet the manuscript provides no direct diagnostic (e.g., average deviation from straight lines, measured curvature, or empirical verification of the transport-cost inequality on held-out pairs) that would confirm the performance originates from the rectification properties rather than model capacity or training heuristics.
minor comments (2)
- [§2–3] The notation for couplings and the precise statement of the population versus empirical objective could be clarified in the main text to make the transition from theory to implementation more transparent.
- [§4] Figure captions and axis labels in the experimental section would benefit from explicit mention of the number of function evaluations used for each baseline to facilitate direct comparison of discretization efficiency.
Simulated Author's Rebuttal
We thank the referee for the constructive review and positive assessment of rectified flow's potential as a unified framework. We address each major comment point by point below, acknowledging where the manuscript can be strengthened through clarification and added analysis.
Point-by-point responses
-
Referee: [§3] §3 (Rectification theorem): The non-increasing convex transport-cost guarantee and the straight-path property are derived for the exact population minimizer of E[‖v((1−t)X₀ + t X₁, t) − (X₁ − X₀)‖²]. The manuscript instead optimizes a neural-network approximator on finite samples; no quantitative bounds on optimization or approximation error are supplied to ensure the induced endpoint map still satisfies the cost inequality or remains sufficiently close to linear interpolants for the single-step Euler claim to be reliable. This gap is load-bearing for the central practical advantage.
Authors: We agree that the rectification theorem, including the non-increasing convex transport cost and straight-path properties, is stated for the exact population minimizer of the least-squares objective. The manuscript optimizes a neural-network parametrization of the velocity field on finite samples and does not supply quantitative bounds on optimization or approximation error. This is a genuine gap between the exact-case analysis and the practical implementation. At the same time, the training objective is explicitly designed to minimize deviation from straight-line paths, and the empirical results on image tasks demonstrate that single-step Euler integration yields competitive sample quality. In revision we will add a dedicated paragraph in §3 discussing the approximation gap, its implications for the guarantees, and related literature on neural approximations to optimal transport maps. revision: yes
-
Referee: [§4] §4 (Empirical validation): The reported single-step generation and translation results are strong, yet the manuscript provides no direct diagnostic (e.g., average deviation from straight lines, measured curvature, or empirical verification of the transport-cost inequality on held-out pairs) that would confirm the performance originates from the rectification properties rather than model capacity or training heuristics.
Authors: The referee is correct that the current version lacks direct quantitative diagnostics linking performance to the rectification mechanism. While we report strong single-step results and note that paths become straighter under recursive rectification, we do not include metrics such as average path deviation, curvature, or held-out transport-cost comparisons. In the revised manuscript we will add these diagnostics: for example, plots of average ||v(x,t) − (x₁ − x₀)|| over t on held-out pairs, empirical verification of the transport-cost inequality before and after rectification, and curvature measures. These additions will help isolate the contribution of the learned straightness from model capacity. revision: yes
Circularity Check
Central claims derive from exact least-squares properties of rectification; neural approximation introduces no definitional circularity.
full rationale
The paper defines rectification as solving, for a given coupling (X₀, X₁) of π₀ and π₁, the nonlinear least-squares problem min over v of E[||v((1−t)X₀ + tX₁, t) − (X₁ − X₀)||²]. Mathematical properties (non-increasing convex transport cost, straighter paths under recursion) are direct consequences of the exact minimizer being the conditional expectation of the straight-line velocity; these follow from standard optimal transport identities rather than self-reference or renaming. The practical implementation uses a neural network trained on finite samples, so the 'provably' statements hold only for the population minimizer. This creates a validity gap between theory and implementation but does not constitute circularity: no step renames a fitted quantity as an independent prediction, no self-citation is load-bearing for the core inequality, and no ansatz is smuggled. The derivation chain remains self-contained once the exact-case math is granted.
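The exact-case statements the rationale grants can be written down explicitly. With (X₀, X₁) the given coupling and (Z₀, Z₁) the endpoints of the rectified ODE, the population minimizer and the cost guarantee (for every convex cost function c) read:

```latex
v^{\star}(x, t) \;=\; \mathbb{E}\!\left[\, X_1 - X_0 \;\middle|\; (1-t)X_0 + t X_1 = x \,\right],
\qquad
\mathbb{E}\!\left[\, c(Z_1 - Z_0) \,\right] \;\le\; \mathbb{E}\!\left[\, c(X_1 - X_0) \,\right].
```

Both statements concern v⋆ itself; the circularity check's point is that the practical gap arises only when a trained network replaces v⋆, not from any self-referential definition.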
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] The learned velocity field admits unique solutions to the ODE initial-value problem for almost all initial conditions.
Lean theorems connected to this paper
-
IndisputableMonolith.Cost.FunctionalEquation.washburn_uniqueness_aczel echoes: "the procedure of learning a rectified flow from data, called rectification, turns an arbitrary coupling of π₀ and π₁ to a new deterministic coupling with provably non-increasing convex transport costs"
-
IndisputableMonolith.Foundation.DAlembert.Inevitability.bilinear_family_forced echoes: "recursively applying rectification allows us to obtain a sequence of flows with increasingly straight paths, which can be simulated accurately with coarse time discretization"
Forward citations
Cited by 60 Pith papers
-
What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching
Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.
-
Generative Modeling with Flux Matching
Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices be...
-
Divergence is Uncertainty: A Closed-Form Posterior Covariance for Flow Matching
In flow matching, the uncertainty of the clean data given the current state is exactly the divergence of the velocity field (up to a known scalar).
-
ReConText3D: Replay-based Continual Text-to-3D Generation
ReConText3D is the first replay-memory framework for continual text-to-3D generation that prevents catastrophic forgetting on new textual categories while preserving quality on previously seen classes.
-
OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models
OP-GRPO is the first off-policy GRPO method for flow-matching models that reuses trajectories via replay buffer and importance sampling corrections, matching on-policy performance with 34.2% of the training steps.
-
Flow-GRPO: Training Flow Matching Models via Online RL
Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.
-
Consistency Models
Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
-
Building Normalizing Flows with Stochastic Interpolants
Normalizing flows are constructed by learning the velocity of a stochastic interpolant via a quadratic loss derived from its probability current, yielding an efficient ODE-based alternative to diffusion models.
-
Aligning Flow Map Policies with Optimal Q-Guidance
Flow map policies enable fast one-step inference for flow-based RL policies, and FMQ provides an optimal closed-form Q-guided target for offline-to-online adaptation under trust-region constraints, achieving SOTA performance.
-
Expected Batch Optimal Transport Plans and Consequences for Flow Matching
The expected minibatch OT plan converges to the true OT plan with quantifiable bias and convergence rates, yielding a regular velocity field for unique flows from source to discrete target in flow matching.
-
$h$-control: Training-Free Camera Control via Block-Conditional Gibbs Refinement
h-control introduces block-conditional pseudo-Gibbs refinement for training-free camera control in flow-matching video generators, achieving superior FVD scores on RealEstate10K and DAVIS benchmarks.
-
One-Step Generative Modeling via Wasserstein Gradient Flows
W-Flow achieves state-of-the-art one-step ImageNet 256x256 generation at 1.29 FID by training a static neural network to follow a Wasserstein gradient flow that minimizes Sinkhorn divergence, delivering roughly 100x f...
-
SABER: A Scalable Action-Based Embodied Dataset for Real-World VLA Adaptation
SABER provides 44.8K multi-representation action samples from unscripted retail environments that raise a VLA model's mean success rate on ten manipulation tasks from 13.4% to 29.3%.
-
Physics-Informed Neural PDE Solvers via Spatio-Temporal MeanFlow
Spatio-Temporal MeanFlow adapts MeanFlow to PDEs by replacing the generative velocity field with the physical operator and extending the integral constraint to the spatio-temporal domain, yielding a unified solver for...
-
Quantile-Coupled Flow Matching for Distributional Reinforcement Learning
FlowIQN is a quantile-coupled CFM critic that yields the first explicit Wasserstein-aligned approximate projection for distributional RL, with improved return-distribution accuracy and competitive offline RL performance.
-
Geometry-Aware Discretization Error of Diffusion Models
First-order asymptotic expansions of weak and Fréchet discretization errors in diffusion sampling are derived, explicit under Gaussian data through covariance geometry and robust to other data geometries.
-
Arena as Offline Reward: Efficient Fine-Grained Preference Optimization for Diffusion Models
ArenaPO infers Gaussian capability distributions from pairwise preferences and applies truncated-normal latent inference to derive fine-grained offline rewards for preference optimization of text-to-image diffusion models.
-
SDFlow: Similarity-Driven Flow Matching for Time Series Generation
SDFlow uses similarity-driven flow matching with low-rank manifold decomposition and a categorical posterior to generate high-fidelity long time series in VQ space without step-wise error accumulation.
-
PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics
PerFlow embeds physics constraints into rectified flow sampling through guidance-free conditioning and constraint-preserving projections, achieving efficient sparse reconstruction and uncertainty quantification for sp...
-
Mixture Prototype Flow Matching for Open-Set Supervised Anomaly Detection
MPFM uses flow matching with a Gaussian mixture prior on the velocity field and a mutual information maximizer to improve open-set anomaly detection over unimodal prototype methods.
-
DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing
DirectEdit achieves step-level accurate inversion for flow-based image editing by directly aligning forward paths, using attention feature injection and mask-guided noise blending to balance fidelity and editability w...
-
Generative Modeling with Orbit-Space Particle Flow Matching
OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.
-
Towards Efficient and Expressive Offline RL via Flow-Anchored Noise-conditioned Q-Learning
FAN achieves state-of-the-art offline RL performance on robotic tasks by anchoring flow policies and using single-sample noise-conditioned Q-learning, with proven convergence and reduced runtimes.
-
Hydra-DP3: Frequency-Aware Right-Sizing of 3D Diffusion Policies for Visuomotor Control
Frequency analysis of smooth robot actions bounds denoising error to low-frequency modes, enabling a sub-1% parameter 3D diffusion policy with two-step inference that reaches SOTA on manipulation benchmarks.
-
CoFlow: Coordinated Few-Step Flow for Offline Multi-Agent Decision Making
CoFlow achieves state-of-the-art coordination quality in offline MARL using only 1-3 denoising steps by natively coupling velocity fields across agents via coordinated attention and gating.
-
TimeTok: Granularity-Controllable Time-Series Generation via Hierarchical Tokenization
TimeTok is a unified framework using hierarchical tokenization for granularity-controllable time-series generation that achieves state-of-the-art performance in standard tasks and shows transferability across heteroge...
-
How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance
FMRG is a training-free, single-trajectory guidance method for flow models derived from optimal control that achieves strong reward alignment with only 3 NFEs.
-
Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.
-
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a di...
-
ML-Guided Primal Heuristics for Mixed Binary Quadratic Programs
New neural architectures and combined contrastive plus weighted cross-entropy losses let ML models predict good solutions for MBQPs and beat existing primal heuristics and solvers on benchmarks and a wind-farm task.
-
HP-Edit: A Human-Preference Post-Training Framework for Image Editing
HP-Edit introduces a post-training framework and RealPref-50K dataset that uses a VLM-based HP-Scorer to align diffusion image editing models with human preferences, improving outputs on Qwen-Image-Edit-2509.
-
Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation
OTCA improves GRPO training for visual generation by estimating step importance in trajectories and adaptively weighting multiple reward objectives.
-
Generative Texture Filtering
A two-stage fine-tuning strategy on pre-trained generative models enables effective texture filtering that outperforms prior methods on challenging cases.
-
Self-Improving Tabular Language Models via Iterative Group Alignment
TabGRAA enables self-improving tabular language models through iterative group-relative advantage alignment using modular automated quality signals like distinguishability classifiers.
-
Long-Text-to-Image Generation via Compositional Prompt Decomposition
PRISM lets pre-trained text-to-image models handle long prompts by breaking them into compositional parts, predicting noise separately, and merging outputs via energy-based conjunction, matching fine-tuned models whil...
-
Grokking of Diffusion Models: Case Study on Modular Addition
Diffusion models show grokking on modular addition by composing periodic operand representations in simple data regimes or by separating arithmetic computation from visual denoising across timesteps in varied regimes.
-
UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
UniGeo unifies geometric guidance across three levels in video models to reduce geometric drift and improve consistency in camera-controllable image editing.
-
Efficient Video Diffusion Models: Advancements and Challenges
A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.
-
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
LeapAlign fine-tunes flow matching models by constructing two consecutive leaps that skip multiple ODE steps with randomized timesteps and consistency weighting, enabling stable updates at any generation step.
-
Beyond Prompts: Unconditional 3D Inversion for Out-of-Distribution Shapes
Text-to-3D models lose prompt sensitivity for out-of-distribution shapes due to sink traps but retain geometric diversity via unconditional priors, enabling a decoupled inversion method for robust editing.
-
LayerCache: Exploiting Layer-wise Velocity Heterogeneity for Efficient Flow Matching Inference
LayerCache enables per-layer-group caching in flow matching models via adaptive JVP span selection and greedy 3D scheduling, delivering 1.37x speedup with PSNR 37.46 dB, SSIM 0.9834, and LPIPS 0.0178 on Qwen-Image.
-
Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale
A 3D-grounded autoencoder and diffusion transformer allow direct generation of 3D scenes in an implicit latent space using a fixed 1K-token representation for arbitrary views and resolutions.
-
GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic
GeRM learns a distribution transfer vector field to convert PBR images into photorealistic ones using a multi-condition ControlNet guided by G-buffers and text prompts.
-
Physically Grounded 3D Generative Reconstruction under Hand Occlusion using Proprioception and Multi-Contact Touch
A conditional diffusion model using proprioception and multi-contact touch produces metric-scale, physically consistent 3D object reconstructions under hand occlusion.
-
Large-Scale Universal Defect Generation: Foundation Models and Datasets
A 300K quadruplet dataset and UniDG foundation model enable reference- or text-driven defect generation across categories, outperforming few-shot baselines on anomaly detection tasks.
-
AniGen: Unified $S^3$ Fields for Animatable 3D Asset Generation
AniGen directly generates animatable 3D assets with consistent shape, skeleton, and skinning from single images using unified S^3 fields and a two-stage flow-matching pipeline.
-
Score Shocks: The Burgers Equation Structure of Diffusion Generative Models
The score in diffusion models obeys viscous Burgers dynamics, with binary mode boundaries producing a universal tanh interfacial profile whose sharpening marks speciation transitions.
-
Isokinetic Flow Matching for Pathwise Straightening of Generative Flows
Isokinetic Flow Matching adds a lightweight regularization term to flow matching that penalizes acceleration along paths via self-guided finite differences, yielding straighter trajectories and large gains in few-step...
-
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
DiffusionNFT performs online RL for diffusion models on the forward process via flow matching and positive-negative contrasts, delivering up to 25x efficiency gains and rapid benchmark improvements over prior reverse-...
-
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
MixGRPO speeds up GRPO for flow-based image generators by restricting SDE sampling and optimization to a sliding window while using ODE elsewhere, cutting training time by up to 71% with better alignment performance.
-
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
-
Reinforcing VLAs in Task-Agnostic World Models
RAW-Dream lets VLAs learn new tasks in zero-shot imagination by using a world model pre-trained only on task-free behaviors and an unmodified VLM to supply rewards, with dual-noise verification to limit hallucinations.
-
EPIC: Efficient Predicate-Guided Inference-Time Control for Compositional Text-to-Image Generation
EPIC introduces predicate-guided inference-time search that lifts compositional T2I prompt accuracy from 34% to 71% on GenEval2 with 31-81% lower execution costs.
-
Operator Spectroscopy of Trained Lattice Samplers
Operator projections of trained sampler functions in 2D phi^4 lattice theory decompose residuals into zero-mode Binder and finite-k correlator components, distinguishing flow-matching, diffusion, and normalizing-flow models.
-
SF-Flow: Sound field magnitude estimation via flow matching guided by sparse measurements
SF-Flow applies flow matching with a permutation-invariant set encoder and 3D U-Net to reconstruct ATF magnitudes from sparse inputs, showing accurate results up to 1 kHz with faster training than autoencoder baselines.
-
Fashion130K: An E-commerce Fashion Dataset for Outfit Generation with Unified Multi-modal Condition
Fashion130K dataset and UMC framework align text and visual prompts with embedding refiner, Fusion Transformer, and redesigned attention to generate more consistent outfits than prior methods.
-
Learning Generative Dynamics with Soft Law Constraints: A McKean-Vlasov FBSDE Approach
A McKean-Vlasov FBSDE generative model learns stochastic path laws that match observed terminal and time-marginal distributions via soft energy constraints rather than hard interpolation.
-
Discrete Flow Matching: Convergence Guarantees Under Minimal Assumptions
Discrete flow matching on Z_m^d achieves non-asymptotic KL bounds for early-stopped targets and explicit TV convergence to the true target under an approximation error assumption, with improved scaling in dimension d ...
-
ACWM-Phys: Investigating Generalized Physical Interaction in Action-Conditioned Video World Models
ACWM-Phys benchmark shows action-conditioned world models generalize on simple geometric interactions but drop sharply on deformable contacts, high-dimensional control, and complex articulated motion, indicating relia...
-
Slowly Annealed Langevin Dynamics: Theory and Applications to Training-Free Guided Generation
Slowly Annealed Langevin Dynamics provides non-asymptotic KL-based convergence guarantees for tracking moving targets and enables training-free guided generation via a velocity-aware correction that accounts for pretr...
Reference graph
Works this paper leans on
-
[1]
Luigi Ambrosio and Gianluca Crippa. Existence, uniqueness, stability and differentiability properties of the flow associated to weakly differentiable vector fields. In Transport equations and multi-D hyperbolic conservation laws, pages 3–57. Springer, 2008
-
[2]
Luigi Ambrosio, Elia Brué, and Daniele Semola. Lectures on optimal transport. Springer, 2021
-
[3]
Reverse-time diffusion equation models
Brian DO Anderson. Reverse-time diffusion equation models. Stochastic Processes and their Applications, 12(3):313–326, 1982
-
[4]
Wasserstein generative adversarial networks
Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In International conference on machine learning, pages 214–223. PMLR, 2017
-
[5]
Fan Bao, Chongxuan Li, Jun Zhu, and Bo Zhang. Analytic-DPM: an analytic estimate of the optimal reverse variance in diffusion probabilistic models. arXiv preprint arXiv:2201.06503, 2022
-
[6]
Neural ordinary differential equations
Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations. Advances in neural information processing systems, 31, 2018
-
[7]
Likelihood training of Schrödinger bridge using forward-backward SDEs theory
Tianrong Chen, Guan-Horng Liu, and Evangelos A Theodorou. Likelihood training of Schrödinger bridge using forward-backward SDEs theory. arXiv preprint arXiv:2110.11291, 2021
-
[8]
Ilvr: Conditioning method for denoising diffusion probabilistic models
Jooyoung Choi, Sungwon Kim, Yonghyun Jeong, Youngjune Gwon, and Sungroh Yoon. Ilvr: Conditioning method for denoising diffusion probabilistic models. arXiv preprint arXiv:2108.02938, 2021
-
[9]
StarGAN v2: Diverse image synthesis for multiple domains
Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8188–8197, 2020
-
[10]
Score-based generative neural networks for large-scale optimal transport
Max Daniels, Tyler Maunu, and Paul Hand. Score-based generative neural networks for large-scale optimal transport. Advances in neural information processing systems, 34:12955–12965, 2021
-
[11]
Diffusion Schrödinger bridge with applications to score-based generative modeling
Valentin De Bortoli, James Thornton, Jeremy Heng, and Arnaud Doucet. Diffusion Schrödinger bridge with applications to score-based generative modeling. Advances in Neural Information Processing Systems, 34, 2021
-
[12]
Diffusion models beat GANs on image synthesis
Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34, 2021
-
[13]
NICE: Non-linear Independent Components Estimation
Laurent Dinh, David Krueger, and Yoshua Bengio. Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516, 2014
-
[14]
Density estimation using Real NVP
Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803, 2016
-
[15]
An Invitation to Optimal Transport, Wasserstein Distances, and Gradient Flows
Alessio Figalli and Federico Glaudo. An Invitation to Optimal Transport, Wasserstein Distances, and Gradient Flows. 2021
-
[16]
Optimal transport for domain adaptation
R Flamary, N Courty, D Tuia, and A Rakotomamonjy. Optimal transport for domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell., 2016
-
[17]
An entropy approach to the time reversal of diffusion processes
Hans F ¨ollmer. An entropy approach to the time reversal of diffusion processes. In Stochastic Differ- ential Systems Filtering and Control, pages 156–163. Springer, 1985
work page 1985
-
[18]
How much is enough? a study on diffusion times in score-based genera- tive models
Giulio Franzese, Simone Rossi, Lixuan Yang, Alessandro Finamore, Dario Rossi, Maurizio Filip- pone, and Pietro Michiardi. How much is enough? a study on diffusion times in score-based genera- tive models. arXiv preprint arXiv:2206.05173, 2022
-
[19]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing systems, 27, 2014
work page 2014
-
[20]
In search of lost domain generalization.arXiv preprint arXiv:2007.01434,
Ishaan Gulrajani and David Lopez-Paz. In search of lost domain generalization. arXiv preprint arXiv:2007.01434, 2020
-
[21]
Flexible diffusion modeling of long videos
William Harvey, Saeid Naderiparizi, Vaden Masrani, Christian Weilbach, and Frank Wood. Flexible diffusion modeling of long videos. arXiv preprint arXiv:2205.11495, 2022
-
[22]
Ulrich G Haussmann and Etienne Pardoux. Time reversal of diffusions. The Annals of Probability, pages 1188–1205, 1986
work page 1986
-
[23]
Denoising diffusion probabilistic models
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
[24]
Video diffusion models
Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J Fleet. Video diffusion models. arXiv preprint arXiv:2204.03458, 2022.
[25]
Image-to-image translation with conditional adversarial networks
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134, 2017.
[26]
TransGAN: Two pure transformers can make one strong GAN, and that can scale up
Yifan Jiang, Shiyu Chang, and Zhangyang Wang. TransGAN: Two pure transformers can make one strong GAN, and that can scale up. Advances in Neural Information Processing Systems, 34, 2021.
[27]
Progressive growing of GANs for improved quality, stability, and variation
Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In International Conference on Learning Representations, 2018.
[28]
Training generative adversarial networks with limited data
Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with limited data. Advances in Neural Information Processing Systems, 33:12104–12114, 2020.
[29]
Elucidating the Design Space of Diffusion-Based Generative Models
Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. arXiv preprint arXiv:2206.00364, 2022.
[30]
Understanding DDPM latent codes through optimal transport
Valentin Khrulkov and Ivan Oseledets. Understanding DDPM latent codes through optimal transport. arXiv preprint arXiv:2202.07477, 2022.
[31]
Adam: A Method for Stochastic Optimization
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[32]
Auto-Encoding Variational Bayes
Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
[33]
DiffWave: A versatile diffusion model for audio synthesis
Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. DiffWave: A versatile diffusion model for audio synthesis. In International Conference on Learning Representations, 2020.
[34]
Do neural optimal transport solvers work? A continuous Wasserstein-2 benchmark
Alexander Korotin, Lingxiao Li, Aude Genevay, Justin M Solomon, Alexander Filippov, and Evgeny Burnaev. Do neural optimal transport solvers work? A continuous Wasserstein-2 benchmark. Advances in Neural Information Processing Systems, 34:14593–14605, 2021.
[35]
Neural optimal transport
Alexander Korotin, Daniil Selikhanovych, and Evgeny Burnaev. Neural optimal transport. arXiv preprint arXiv:2201.12220, 2022.
[36]
Learning multiple layers of features from tiny images
Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
[37]
Equivalence of stochastic equations and martingale problems
Thomas G Kurtz. Equivalence of stochastic equations and martingale problems. In Stochastic Analysis 2010, pages 113–130. Springer, 2011.
[38]
Improved precision and recall metric for assessing generative models
Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Improved precision and recall metric for assessing generative models. Advances in Neural Information Processing Systems, 32, 2019.
[39]
The flow map of the Fokker–Planck equation does not provide optimal transport
Hugo Lavenant and Filippo Santambrogio. The flow map of the Fokker–Planck equation does not provide optimal transport. Applied Mathematics Letters, page 108225, 2022.
[40]
NU-Wave: A diffusion probabilistic model for neural audio upsampling
Junhyeok Lee and Seungu Han. NU-Wave: A diffusion probabilistic model for neural audio upsampling. arXiv preprint arXiv:2104.02321, 2021.
[41]
Diffusion-LM improves controllable text generation
Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, and Tatsunori B Hashimoto. Diffusion-LM improves controllable text generation. arXiv preprint arXiv:2205.14217, 2022.
[42]
On rectified flow and optimal coupling
Qiang Liu. On rectified flow and optimal coupling. Preprint, 2022.
[43]
FuseDream: Training-free text-to-image generation with improved CLIP+GAN space optimization
Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, and Qiang Liu. FuseDream: Training-free text-to-image generation with improved CLIP+GAN space optimization. arXiv preprint arXiv:2112.01573, 2021.
[44]
Let us build bridges: Understanding and extending diffusion generative models
Xingchao Liu, Lemeng Wu, Mao Ye, and Qiang Liu. Let us build bridges: Understanding and extending diffusion generative models. arXiv preprint arXiv:2208.14699, 2022.
[45]
Decoupled Weight Decay Regularization
Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
[46]
DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps
Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. arXiv preprint arXiv:2206.00927, 2022.
[47]
Knowledge distillation in iterative generative models for improved sampling speed
Eric Luhman and Troy Luhman. Knowledge distillation in iterative generative models for improved sampling speed. arXiv preprint arXiv:2101.02388, 2021.
[48]
Accelerating diffusion models via early stop of the diffusion process
Zhaoyang Lyu, Xudong Xu, Ceyuan Yang, Dahua Lin, and Bo Dai. Accelerating diffusion models via early stop of the diffusion process. arXiv preprint arXiv:2205.12524, 2022.
[49]
Optimal transport mapping via input convex neural networks
Ashok Makkuva, Amirhossein Taghvaei, Sewoong Oh, and Jason Lee. Optimal transport mapping via input convex neural networks. In International Conference on Machine Learning, pages 6672–6681. PMLR, 2020.
[50]
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, and Stefano Ermon. SDEdit: Guided image synthesis and editing with stochastic differential equations. arXiv preprint arXiv:2108.01073, 2021.
[51]
Symbolic music generation with diffusion models
Gautam Mittal, Jesse Engel, Curtis Hawthorne, and Ian Simon. Symbolic music generation with diffusion models. arXiv preprint arXiv:2103.16091, 2021.
[52]
Spectral normalization for generative adversarial networks
Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. In International Conference on Learning Representations, 2018.
[53]
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint arXiv:2112.10741, 2021.
[54]
Improved denoising diffusion probabilistic models
Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pages 8162–8171. PMLR, 2021.
[55]
OT-Flow: Fast and accurate continuous normalizing flows via optimal transport
Derek Onken, Samy Wu Fung, Xingjian Li, and Lars Ruthotto. OT-Flow: Fast and accurate continuous normalizing flows via optimal transport. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 9223–9232, 2021.
[56]
Normalizing flows for probabilistic modeling and inference
George Papamakarios, Eric T Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, and Balaji Lakshminarayanan. Normalizing flows for probabilistic modeling and inference. J. Mach. Learn. Res., 22(57):1–64, 2021.
[57]
Non-denoising forward-time diffusions
Stefano Peluchetti. Non-denoising forward-time diffusions. 2021.
[58]
Moment matching for multi-source domain adaptation
Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, and Bo Wang. Moment matching for multi-source domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1406–1415, 2019.
[59]
Computational optimal transport: With applications to data science
Gabriel Peyré, Marco Cuturi, et al. Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.
[60]
Grad-TTS: A diffusion probabilistic model for text-to-speech
Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, and Mikhail Kudinov. Grad-TTS: A diffusion probabilistic model for text-to-speech. In International Conference on Machine Learning, pages 8599–8608. PMLR, 2021.
[61]
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with CLIP latents. arXiv preprint arXiv:2204.06125, 2022.
[62]
Variational inference with normalizing flows
Danilo Rezende and Shakir Mohamed. Variational inference with normalizing flows. In International Conference on Machine Learning, pages 1530–1538. PMLR, 2015.
[63]
Generative modeling with optimal transport maps
Litu Rout, Alexander Korotin, and Evgeny Burnaev. Generative modeling with optimal transport maps. arXiv preprint arXiv:2110.02999, 2021.
[64]
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S Sara Mahdavi, Rapha Gontijo Lopes, et al. Photorealistic text-to-image diffusion models with deep language understanding. arXiv preprint arXiv:2205.11487, 2022.
[65]
Optimal transport for applied mathematicians
Filippo Santambrogio. Optimal transport for applied mathematicians. Birkhäuser, NY, 55(58-63):94, 2015.
[66]
StyleGAN-XL: Scaling StyleGAN to large diverse datasets
Axel Sauer, Katja Schwarz, and Andreas Geiger. StyleGAN-XL: Scaling StyleGAN to large diverse datasets. In Special Interest Group on Computer Graphics and Interactive Techniques Conference Proceedings, pages 1–10, 2022.
[67]
Large-scale optimal transport and mapping estimation
Vivien Seguy, Bharath Bhushan Damodaran, Rémi Flamary, Nicolas Courty, Antoine Rolet, and Mathieu Blondel. Large-scale optimal transport and mapping estimation. arXiv preprint arXiv:1711.02283, 2017.
[68]
D2C: Diffusion-decoding models for few-shot conditional generation
Abhishek Sinha, Jiaming Song, Chenlin Meng, and Stefano Ermon. D2C: Diffusion-decoding models for few-shot conditional generation. Advances in Neural Information Processing Systems, 34:12533–12548, 2021.
[69]
Super-convergence: Very fast training of neural networks using large learning rates
Leslie N Smith and Nicholay Topin. Super-convergence: Very fast training of neural networks using large learning rates. In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, volume 11006, pages 369–386. SPIE, 2019.
[70]
Denoising diffusion implicit models
Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2020.
[71]
Generative modeling by estimating gradients of the data distribution
Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 32, 2019.
[72]
Improved techniques for training score-based generative models
Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. Advances in Neural Information Processing Systems, 33:12438–12448, 2020.
[73]
Score-based generative modeling through stochastic differential equations
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2020.
[74]
Maximum likelihood training of score-based diffusion models
Yang Song, Conor Durkan, Iain Murray, and Stefano Ermon. Maximum likelihood training of score-based diffusion models. Advances in Neural Information Processing Systems, 34, 2021.
[75]
Dual diffusion implicit bridges for image-to-image translation
Xuan Su, Jiaming Song, Chenlin Meng, and Stefano Ermon. Dual diffusion implicit bridges for image-to-image translation. arXiv preprint arXiv:2203.08382, 2022.
[76]
Deep CORAL: Correlation alignment for deep domain adaptation
Baochen Sun and Kate Saenko. Deep CORAL: Correlation alignment for deep domain adaptation. In European Conference on Computer Vision, pages 443–450. Springer, 2016.
[77]
EfficientNet: Rethinking model scaling for convolutional neural networks
Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR, 2019.
[78]
Comparison of transport map generated by heat flow interpolation and the optimal transport Brenier map
Anastasiya Tanana. Comparison of transport map generated by heat flow interpolation and the optimal transport Brenier map. Communications in Contemporary Mathematics, 23(06):2050025, 2021.
[79]
Data-driven optimal transport
Giulio Trigila and Esteban G Tabak. Data-driven optimal transport. Communications on Pure and Applied Mathematics, 69(4):613–648, 2016.
[80]
Theoretical guarantees for sampling and inference in generative models with latent diffusions
Belinda Tzen and Maxim Raginsky. Theoretical guarantees for sampling and inference in generative models with latent diffusions. In Conference on Learning Theory, pages 3084–3114. PMLR, 2019.