Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.
Flow Matching Guide and Code
Mixed citation behavior. Most common role is background (60%).
abstract
Flow Matching (FM) is a recent framework for generative modeling that has achieved state-of-the-art performance across various domains, including image, video, audio, speech, and biological structures. This guide offers a comprehensive and self-contained review of FM, covering its mathematical foundations, design choices, and extensions. By also providing a PyTorch package featuring relevant examples (e.g., image and text generation), this work aims to serve as a resource for both novice and experienced researchers interested in understanding, applying and further developing FM.
representative citing papers
Proposes diffeomorphic optimization for manifold-constrained problems in generative models via flow maps, with Lie-group extensions for protein design showing metric improvements.
Flow-Map GRPO uses anchored stochastic flow map composition to enable GRPO-based RL alignment of deterministic few-step flow-map generators while preserving their marginal paths.
Introduces structured DRO for learned inverse problem reconstructions with ambiguity sets aligned to the forward operator, yielding explicit dual representations and a worst-case bound that induces Tikhonov regularization on the operator Lipschitz constant.
PAINT reframes asynchronous flow-based action chunking as an initial noise selection problem solved via backward Euler inversion and a repainting rule.
Self-distillation from a caption-conditioned video diffusion model to an image-and-prompt-conditioned executor, enhanced by RL from VLM feedback, enables task solving in world models.
ActFlow expands the generable set of pre-trained flow models for out-of-distribution molecular and sequence design via active synthetic data generation and verifier feedback, with new statistical guarantees.
RLDT fine-tunes pretrained flow-matching policies for continuous control by aligning them to a max-entropy RL transport field constructed via SVGD, using expected-target estimation for stable multi-step updates.
Low-Pass Flow Matching modifies Flow Matching via an operator-modulated interpolant inducing time-varying spectral bias from source spectrum to frequency-decaying bias, improving or preserving quality while reducing sampling cost on unconditional image generation including Galaxy10.
Introduces a state-aligned latent actor-critic framework that lets diffusion models act as their own timestep-conditioned value functions for trajectory-level RL post-training and inference steering.
A control-theoretic linear program yields value-driven transport policies for generative modeling with straight paths and simulation-free training.
DiSI disentangles stochastic interpolants into separate generation and regression paths, allowing controllable transitions between regression and generative image restoration with a unified few-step sampler.
Discrete MeanFlow parameterizes CTMC conditional transition kernels with a boundary-by-construction design to enable exact one-step generation in discrete state spaces.
ALU uses public data to suppress unlearning cost quadratically while characterizing distribution mismatch effects, enabling mass unlearning with maintained utility.
FlowIQN is a quantile-coupled CFM critic that yields the first explicit Wasserstein-aligned approximate projection for distributional RL, with improved return-distribution accuracy and competitive offline RL performance.
MPFM models flow matching velocity as a Gaussian mixture prior per normal class plus a mutual information regularizer to improve open-set anomaly detection over unimodal prototypes.
OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.
Binomial flows close the gap between continuous flow matching and discrete ordinal data by using binomial distributions to enable unified denoising, sampling, and exact likelihoods in diffusion models.
LeapAlign fine-tunes flow matching models by constructing two consecutive leaps that skip multiple ODE steps with randomized timesteps and consistency weighting, enabling stable updates at any generation step.
TokenLight encodes lighting attributes as tokens in a conditional image generation model trained mostly on synthetic data, enabling precise relighting control and implicit learning of light-scene interactions.
DoMinO reformulates discrete flow matching sampling as an MDP for unbiased RL fine-tuning with new TV regularizers, yielding better enhancer activity and naturalness on DNA design tasks.
DiNa-LRM introduces a diffusion-native latent reward model using a noise-calibrated Thurstone likelihood on noisy states, matching VLM performance at lower compute in image alignment and preference optimization.
Flow matching on time series targets a closed-form nonparametric velocity field that is a similarity-weighted mixture of observed transition velocities, making neural models approximations to an ideal memory-augmented dynamical system sampler.
Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, and non-unique dynamics via flux-null fields, with base distribution controlling kinetic energy tails.
citing papers explorer
-
World Model Self-Distillation: Training World Models to Solve General Tasks
Self-distillation from a caption-conditioned video diffusion model to an image-and-prompt-conditioned executor, enhanced by RL from VLM feedback, enables task solving in world models.
-
Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration
DiSI disentangles stochastic interpolants into separate generation and regression paths, allowing controllable transitions between regression and generative image restoration with a unified few-step sampler.
-
Mixture Prototype Flow Matching for Open-Set Supervised Anomaly Detection
MPFM models flow matching velocity as a Gaussian mixture prior per normal class plus a mutual information regularizer to improve open-set anomaly detection over unimodal prototypes.
-
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
LeapAlign fine-tunes flow matching models by constructing two consecutive leaps that skip multiple ODE steps with randomized timesteps and consistency weighting, enabling stable updates at any generation step.
-
TokenLight: Precise Lighting Control in Images using Attribute Tokens
TokenLight encodes lighting attributes as tokens in a conditional image generation model trained mostly on synthetic data, enabling precise relighting control and implicit learning of light-scene interactions.
-
Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling
DiNa-LRM introduces a diffusion-native latent reward model using a noise-calibrated Thurstone likelihood on noisy states, matching VLM performance at lower compute in image alignment and preference optimization.
-
Part²GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting
Part²GS introduces a part-aware 3D Gaussian representation with physics-guided motion constraints and a repel point field for high-fidelity modeling of articulated objects.
-
Wavelet-Optimized Pseudo-3D Accelerated Diffusion Model for Truncated Computed Laminography
CL-DM combines 2D diffusion models with 3D MBIR, wavelet regularization, and fast sampling to eliminate truncation artifacts and restore continuous 3D structures in computed laminography beyond the FOV.
-
Lance: Unified Multimodal Modeling by Multi-Task Synergy
Lance presents a dual-stream mixture-of-experts model with modality-aware positional encoding and staged multi-task training that outperforms prior open-source unified models on image and video generation while keeping strong understanding performance.
-
TOPOS: High-Fidelity and Efficient Industry-Grade 3D Head Generation
TOPOS creates high-fidelity 3D heads with fixed industry topology from single images via a specialized VAE with Perceiver Resampler and a rectified flow transformer.
-
PhyMix: Towards Physically Consistent Single-Image 3D Indoor Scene Generation with Implicit-Explicit Optimization
PhyMix unifies a new multi-aspect physics evaluator with implicit policy optimization and explicit test-time correction to produce single-image 3D indoor scenes that are both visually faithful and physically plausible.
-
KM-Speaker: Keypoint-Based Style Control for High-Quality Speech-Driven 3D Facial Animation and Dialogue Localization
KM-Speaker introduces a keypoint-conditioned generative framework for speech-driven 3D facial animation offering global style guidance and frame-level temporal control via disentangled lip and upper-face dynamics.
-
Test-Time Adaptation in Optical Coherence Tomography Using Trajectory-Aligned Time-Independent Flow
Flow-matching test-time adaptation (TTA) with histogram matching to synthetic reference trajectories and a time-independent flow achieves state-of-the-art segmentation of AMD biomarkers in OCT.
-
Nano World Models: A Minimalist Implementation of Future Video Prediction
Nano World Models supplies a unified minimalist codebase and evaluation framework for studying diffusion forcing in video prediction across control, games, and robot domains.
-
Exploring Motion-Language Alignment for Text-driven Motion Generation
MLA-Gen advances text-driven motion synthesis by aligning global motion patterns with fine-grained text semantics and mitigating attention sink effects via new masking techniques.
-
MedShift: Implicit Conditional Transport for X-Ray Domain Adaptation
MedShift applies flow matching and Schrödinger bridges for class-conditional unpaired translation between synthetic and real skull X-rays, benchmarked on the new X-DigiSkull dataset.