Flow Matching for Generative Modeling

Heli Ben-Hamu, Matt Le, Maximilian Nickel, Ricky TQ Chen, Yaron Lipman · 2022 · cs.LG · arXiv 2210.02747

Mixed citation behavior. The most common role is method (47% of classified citations).

781 Pith papers cite it.

abstract

We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples -- which subsumes existing diffusion paths as specific instances. Interestingly, we find that employing FM with diffusion paths results in a more robust and stable alternative for training diffusion models. Furthermore, Flow Matching opens the door to training CNFs with other, non-diffusion probability paths. An instance of particular interest is using Optimal Transport (OT) displacement interpolation to define the conditional probability paths. These paths are more efficient than diffusion paths, provide faster training and sampling, and result in better generalization. Training CNFs using Flow Matching on ImageNet leads to consistently better performance than alternative diffusion-based methods in terms of both likelihood and sample quality, and allows fast and reliable sample generation using off-the-shelf numerical ODE solvers.
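The regression target described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the commonly used linear (OT displacement) conditional path, x_t = (1 - (1 - sigma_min) t) x0 + t x1, whose conditional velocity is x1 - (1 - sigma_min) x0. The function names `ot_conditional_path` and `cfm_loss`, the default `sigma_min`, and the toy data are illustrative assumptions and are not taken from this page.

```python
import numpy as np

def ot_conditional_path(x0, x1, t, sigma_min=1e-4):
    """Return (x_t, u_t) on the linear OT conditional path.

    x0: noise samples, shape (batch, dim)
    x1: data samples, shape (batch, dim)
    t:  times in [0, 1], shape (batch,)
    sigma_min: assumed small terminal noise scale (hypothetical default)
    """
    t = t[:, None]  # broadcast time over the feature dimension
    x_t = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1.0 - sigma_min) * x0  # constant-in-time target velocity
    return x_t, u_t

def cfm_loss(v_fn, x0, x1, t, sigma_min=1e-4):
    """Mean squared error between a velocity model and the conditional target.

    v_fn is any callable (x_t, t) -> array shaped like x_t; in practice it
    would be a neural network, here it is left abstract.
    """
    x_t, u_t = ot_conditional_path(x0, x1, t, sigma_min)
    residual = v_fn(x_t, t) - u_t
    return float(np.mean(np.sum(residual ** 2, axis=1)))

# Toy usage: noise from N(0, I), "data" from a shifted Gaussian.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 2))
x1 = rng.standard_normal((8, 2)) + 3.0
t = rng.uniform(size=8)

def zero_field(x, t):
    # Placeholder model that predicts zero velocity everywhere.
    return np.zeros_like(x)

loss = cfm_loss(zero_field, x0, x1, t)
```

No ODE is simulated during training in this sketch: each step only needs a noise sample, a data sample, and a random time, which is the sense in which the approach is simulation-free.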

citation-role summary

background 79 · method 74 · baseline 5

citation-polarity summary

use method 74 · background 71 · unclear 6 · baseline 5 · support 2

authors

Heli Ben-Hamu, Matt Le, Maximilian Nickel, Ricky TQ Chen, Yaron Lipman

representative citing papers

Test-time Adversarial Takeover: A Real-time Hijacking Interface against Robotic Diffusion Policies

cs.RO · 2026-06-09 · unverdicted · novelty 8.0

TAKO demonstrates real-time adversarial takeover of robotic diffusion policies via reusable universal patches on visual inputs, achieving 100% success in steering attacker-chosen trajectories across multiple tasks, encoders, and diffusion methods.

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

eess.AS · 2026-06-02 · unverdicted · novelty 8.0

WavTTS is the first raw-waveform diffusion TTS model using DiT flow matching and multi-scale mel supervision that approaches SOTA latent zero-shot performance while beating prior end-to-end models.

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

cs.CV · 2026-05-13 · unverdicted · novelty 8.0

AnyFlow enables any-step video diffusion by distilling flow-map transitions over arbitrary time intervals with on-policy backward simulation.

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

cs.CV · 2026-05-12 · unverdicted · novelty 8.0

TrackCraft3R is the first method to repurpose a video diffusion transformer as a feed-forward dense 3D tracker via dual-latent representations and temporal RoPE alignment, achieving SOTA performance with lower compute.

What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.

Generative Modeling with Flux Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices beyond traditional score matching.

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

q-bio.QM · 2026-05-05 · unverdicted · novelty 8.0

A-CODE presents a fully atomic one-stage multimodal diffusion model for protein co-design that claims superior unconditional generation performance over prior one- and two-stage models plus a tenfold success-rate gain on hard binder-design tasks.

Divergence is Uncertainty: A Closed-Form Posterior Covariance for Flow Matching

cs.LG · 2026-05-01 · unverdicted · novelty 8.0 · 3 refs

Derives closed-form posterior covariance for flow matching from divergence of velocity field, enabling post-hoc uncertainty on pre-trained models including one-step generators.

How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

cs.LG · 2026-04-29 · unverdicted · novelty 8.0 · 3 refs

FMRG reformulates guidance as deterministic optimal control, deriving a single-trajectory method using the flow map that matches or exceeds baselines on reward-guided generation and inverse problems with 3 NFEs at text-to-image scale.

ReConText3D: Replay-based Continual Text-to-3D Generation

cs.CV · 2026-04-15 · conditional · novelty 8.0

ReConText3D is the first replay-memory framework for continual text-to-3D generation that prevents catastrophic forgetting on new textual categories while preserving quality on previously seen classes.

Query Lower Bounds for Diffusion Sampling

cs.LG · 2026-04-12 · unverdicted · novelty 8.0

Diffusion sampling from d-dimensional distributions requires at least ~sqrt(d) adaptive score queries when score estimates have polynomial accuracy.

OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models

cs.CV · 2026-04-05 · unverdicted · novelty 8.0

OP-GRPO is the first off-policy GRPO method for flow-matching models that reuses trajectories via replay buffer and importance sampling corrections, matching on-policy performance with 34.2% of the training steps.

Generative models on phase space

hep-ph · 2026-04-02 · unverdicted · novelty 8.0

Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.

FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models

cs.CV · 2026-03-30 · unverdicted · novelty 8.0

FlowHijack is the first dynamics-aware backdoor attack on flow-matching VLAs that achieves high success rates with stealthy triggers while preserving benign performance and making malicious actions kinematically indistinguishable from normal ones.

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

Building Normalizing Flows with Stochastic Interpolants

cs.LG · 2022-09-30 · conditional · novelty 8.0 · 2 refs

Normalizing flows are constructed by learning the velocity of a stochastic interpolant via a quadratic loss derived from its probability current, yielding an efficient ODE-based alternative to diffusion models.

QWERTY: Training-Free Motion Control via Query-Warped Video Diffusion Transformers

cs.CV · 2026-07-02 · unverdicted · novelty 7.0

QWERTY enables training-free motion control in pretrained image-to-video DiTs by warping the frame-invariant semantic subspace of queries in 3D full attention and using the predicted noise as self-guidance for latent optimization.

Diffeomorphic Optimization

cs.LG · 2026-07-01 · unverdicted · novelty 7.0

Proposes diffeomorphic optimization for manifold-constrained problems in generative models via flow maps, with Lie-group extensions for protein design showing metric improvements.

Self-conditioned Flow Map Language Models via Fixed-point Flows

cs.CL · 2026-07-01 · unverdicted · novelty 7.0

Self-conditioned flow language models solve fixed-point iterations, enabling fixed-point flow maps that distill into FMLM* which outperforms SOTA in few-step generation on OpenWebText.

Flow-Map GRPO: Reinforcement Learning for Few-Step Flow-Map Generators via Anchored Stochastic Composition

cs.LG · 2026-07-01 · unverdicted · novelty 7.0

Flow-Map GRPO uses anchored stochastic flow map composition to enable GRPO-based RL alignment of deterministic few-step flow-map generators while preserving their marginal paths.

Cross-Space Distillation: Teaching One-Step Students with Modern Diffusion Teachers

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

Introduces a Bridge latent interface that maps mismatched student latents into teacher space, enabling distillation from modern diffusion teachers to compact one-step students and raising SD 1.5 HPSv3 from 5.4 to 9.4 while keeping one-step speed.

OopsieVerse: A Safety Benchmark with Damage-Aware Simulation for Robot Manipulation

cs.RO · 2026-06-30 · unverdicted · novelty 7.0

OopsieVerse is a new damage-aware simulation benchmark for household robot manipulation that converts contact, thermal, and fluid signals into task-agnostic damage metrics and demonstrates uses in safer policy learning and benchmarking.

FlexiSLM: A Dynamic and Controllable Frame Rate Spoken Language Model

cs.SD · 2026-06-30 · unverdicted · novelty 7.0

FlexiSLM is the first spoken language model supporting dynamic and controllable frame rates on speech input and output, outperforming fixed-rate 7B models at high quality and enabling faster inference at lower rates like 6.25 Hz.

MUSE: Unlocking Timestep as Native Task Steering for One-Step Dense Prediction

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

MUSE shows that the native timestep embedding in diffusion models acts as a parameter-free steering signal for multi-task monocular depth and normal estimation via manifold decoupling in latent space.

citing papers explorer

Showing 50 of 781 citing papers.

Learning to Distort: Weakly-Supervised Image Quality Transfer for Prostate DWI Correction

cs.CV · 2026-06-17 · unverdicted · none · ref 13 · internal anchor

A weakly-supervised image quality transfer method generates synthetic distorted DWI images from quality labels to train improved distortion correction models for prostate MRI.

OmniDrive: An LLM-Choreographed Multi-Agent World Model with Unified Latent Co-Compression for Multi-View Driving Video Generation

cs.CV · 2026-06-16 · unverdicted · none · ref 29 · internal anchor

DRIVE-CHOREO uses three LLM agents to create a unified position-aware token sequence co-compressed with multi-view video, achieving SOTA BEV mAP of 21.6 and +2.4 NDS improvement on nuScenes.

Transformer-Based Warm-Starting for Feasible and Optimal Terminal Approach to Tumbling Objects with Space Manipulators

cs.RO · 2026-06-15 · unverdicted · none · ref 29 · internal anchor

Transformer warm-starts cut SCP iterations by up to 28% and runtime by 23% for space manipulator terminal guidance while preserving cost distributions and improving feasibility projection robustness.

Improving Robotic Generalist Policies via Flow Reversal Steering

cs.RO · 2026-06-11 · unverdicted · none · ref 14 · internal anchor

Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.

Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics

cs.RO · 2026-06-10 · unverdicted · none · ref 58 · internal anchor

Ambient Diffusion Policy enables better imitation learning from suboptimal robot data by leveraging spectral properties to restrict data usage to specific diffusion times.

World Model Self-Distillation: Training World Models to Solve General Tasks

cs.CV · 2026-06-10 · unverdicted · none · ref 38 · internal anchor

Self-distillation from a caption-conditioned video diffusion model to an image-and-prompt-conditioned executor, enhanced by RL from VLM feedback, enables task solving in world models.

Net-Ev$^2$: A Generative Simulator for Network Event Evolution

cs.LG · 2026-06-10 · unverdicted · none · ref 25 · internal anchor

Net-Ev² proposes a two-stage generative simulator with structure-guided masked pre-training and topology-aware diffusion using graph U-Net down/upsampling to model network event evolution from text inputs, plus a new 6.5M multimodal benchmark and JL-MMD metric.

Flow Matching with In-Context Priors for Out-of-Distribution Brain Dynamics

cs.LG · 2026-06-10 · unverdicted · none · ref 8 · internal anchor

A per-timestep conditioned diffusion transformer generates realistic fMRI dynamics for unseen cognitive tasks by injecting compositional language and optional spatial priors in-context.

Spectrally Regularized Latent Flow Matching for Turbulence Generation

cs.LG · 2026-06-10 · unverdicted · none · ref 32 · internal anchor

Spectrally regularized compression in latent flow matching raises retained deep-dissipation spectral power from 20% to 79% in generated turbulence on a 256^2 DNS dataset at Re_f ≈ 2250.

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

cs.LG · 2026-06-09 · unverdicted · none · ref 39 · internal anchor

QGF performs test-time policy optimization for flow models in RL by guiding a behavior-cloned reference policy with value-function gradients, achieving strong results on high-dimensional offline RL benchmarks without additional policy training.

EgoTactile: Learning Grasp Pressure for Everyday Objects from Egocentric Video

cs.CV · 2026-06-08 · unverdicted · none · ref 31 · internal anchor

EgoTactile benchmark and EgoPressureDiff diffusion framework for estimating full-hand grasp pressure from egocentric video.

HoliDubber: Holistic Video Dubbing for Complex Acoustic Scenes via Text-Guided Audio Synthesis

eess.AS · 2026-06-08 · unverdicted · none · ref 41 · internal anchor

HoliDubber introduces a patch-based autoregressive diffusion transformer for joint text-guided synthesis of speech and ambient audio in video dubbing, with a new benchmark showing outperformance over prior speech-only methods.

Continuous Language Diffusion as a Decoder-Interface Problem

cs.CL · 2026-06-07 · unverdicted · none · ref 43 · internal anchor

Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.

MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training

cs.CV · 2026-06-07 · unverdicted · none · ref 13 · internal anchor

MaskAlign uses random token-subset alignment and pre-mask mixing to reduce diffusion models' reliance on complete clean-image token sets during representation alignment.

Reinforcement Learning for Flow-Matching Policies with Density Transport

cs.LG · 2026-06-07 · unverdicted · none · ref 25 · internal anchor

RLDT fine-tunes pretrained flow-matching policies for continuous control by aligning them to a max-entropy RL transport field constructed via SVGD, using expected-target estimation for stable multi-step updates.

OmniTryOn: Video Try-On Anything at Once!

cs.CV · 2026-06-07 · unverdicted · none · ref 31 · internal anchor

OmniTryOn performs multi-object video virtual try-on in one pass using first-frame wearable caching and spatiotemporal RoPE, outperforming single-garment baselines on a new TryAny-Bench dataset.

From Pixels to Newtons: Predicting In Vivo Joint Contact Forces from Monocular Video

cs.CV · 2026-06-04 · unverdicted · none · ref 50 · internal anchor

A transformer model predicts in vivo hip and knee contact forces from uncalibrated monocular video at accuracy matching subject-specific musculoskeletal simulations under leave-one-subject-out validation.

Complexity-Balanced Diffusion Splitting

cs.CV · 2026-06-04 · unverdicted · none · ref 22 · internal anchor

CBS partitions the diffusion timeline into segments of equal approximation burden via Dirichlet energy and trajectory acceleration monitors estimated by an auxiliary model, yielding higher synthesis quality at fixed per-step cost across SiT, JiT and UNet backbones.

Geodesic Flow Matching on a Riemannian Degradation Manifold for Blind Image Restoration

cs.CV · 2026-06-04 · unverdicted · none · ref 20 · internal anchor

Introduces geodesic flow matching on a Riemannian degradation manifold to generalize linear flow matching for blind image restoration.

Diff-CA: Separating Common and Salient Factors with Diffusion Models

cs.CV · 2026-06-04 · unverdicted · none · ref 32 · internal anchor

A diffusion-based contrastive analysis method that decomposes conditioning into common and salient factors with weak supervision and proves identifiability of the additive model.

Balancing Image Compression and Generation with Bootstrapped Tokenization

cs.LG · 2026-06-04 · unverdicted · none · ref 2 · internal anchor

SelfBootTok decomposes image tokens into global and local groups via self-bootstrapped learning, enabling generators to use only global tokens for ~40% less computation and a new SOTA gFID of 1.56 with 64 tokens.

Imagine Before You Draw: Visual Prompt Engineering for Image Generation

cs.CV · 2026-06-03 · unverdicted · none · ref 11 · internal anchor

VPE inserts an internal autoregressive visual semantic token generation step to guide image token production in unified models, reporting faster convergence, higher quality, and superior editing preservation (PSNR 26.76 vs 19.92) versus external alternatives.

SymTRELLIS: Symmetry-Enforced Voxel Latents for 3D Generation

cs.GR · 2026-06-02 · unverdicted · none · ref 23 · internal anchor

SymTRELLIS enforces finite point-group symmetries during TRELLIS.2 generation via a learned linear latent-space mapper and velocity symmetrization, reducing symmetry errors on a 266-object benchmark while preserving reconstruction quality.

Optimal Transport Flow Matching by Design

cs.CV · 2026-06-02 · unverdicted · none · ref 37 · internal anchor

By designing the prior as the low-frequency projection of data images, flow matching achieves OT-optimal identity couplings without explicit OT computation, reducing trajectory curvature over 2x and improving few-step quality.

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

cs.CV · 2026-06-01 · unverdicted · none · ref 7 · 2 links · internal anchor

VLMs formulate differentiable rewards from task-specific rules to enable test-time online LoRA optimization of VGMs, delivering 16.7-point gains on symbolic and general video reasoning benchmarks over VLM-as-solver and Best-of-N baselines.

Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation

cs.CV · 2026-05-31 · unverdicted · none · ref 31 · internal anchor

DRDD decouples diffusion into independent noise and residual stages to preserve domain harmonization and enable unified data-efficient I2I translation.

FlowOVD: Learning Generative Latent Flows for Zero-shot Open-vocabulary Detection

cs.CV · 2026-05-30 · unverdicted · none · ref 15 · internal anchor

FlowOVD applies rectified flow to generate continuous latent query dynamics for text-conditioned open-vocabulary detection, reporting 49.5 AP on COCO and 31.5 AP on LVIS.

Learning Global Motion with Compact Gaussians for Feed-Forward 4D Reconstruction

cs.CV · 2026-05-29 · unverdicted · none · ref 61 · internal anchor

C4G introduces compact timestamp-conditioned Gaussian query tokens that aggregate full temporal context to decode 3D Gaussians with timestamp-modulated positions for feed-forward 4D reconstruction from monocular video, plus a diffusion-based rendering module and extension to 4D feature fields.

Free energy Estimation on Any State Space

stat.ML · 2026-05-29 · unverdicted · none · ref 30 · internal anchor

Generalizes neural transport methods for free energy estimation to any state space with added algebraic and group-theoretic results on time reversal and h-transforms.

Midpoint Generative Models

cs.LG · 2026-05-28 · unverdicted · none · ref 19 · internal anchor

Midpoint Generative Models define a midpoint divergence from flow matching symmetry and derive its variational form as a tractable objective for training competitive one-step generators.

A Unified Two-Stage Generative Diffusion Framework for Channel Estimation and Port Selection in Multiuser MIMO-FAS

cs.IT · 2026-05-28 · unverdicted · none · ref 34 · internal anchor

A two-stage diffusion framework decomposes MAP inference for MIMO-FAS into continuous-flow channel estimation and discrete diffusion port selection, claiming superior accuracy and rates via simulations.

Orthogonal Negative Guidance in Attention Feature Space for Text-to-Image Generation

cs.CV · 2026-05-28 · unverdicted · none · ref 34 · internal anchor

Orthogonal Negative Guidance subtracts only the orthogonal component of negative-prompt attention features from positive ones in FLUX models to suppress concepts while preserving semantics and quality.

Spectral Guidance for Flexible and Efficient Control of Diffusion Models

cs.LG · 2026-05-27 · unverdicted · none · ref 10 · internal anchor

Spectral Guidance learns singular functions via self-supervised objective to project guidance signals onto diffusion sampling trajectories, enabling stable control without retraining or backpropagation and improving CIFAR-10 accuracy by 37 points with 4x faster sampling.

Parameter-Efficient Generative Modeling with Controlled Vector Fields

cs.LG · 2026-05-27 · unverdicted · none · ref 3 · internal anchor

Presents a controlled vector field framework for continuous generative modeling where velocity is formed from fixed bracket-generating fields modulated by scalar controls, with an expressivity principle under controllability assumptions.

Paris 2.0: A Decentralized Diffusion Model for Video Generation

cs.CV · 2026-05-25 · unverdicted · none · ref 5 · internal anchor

Paris 2.0 is the first decentralized diffusion model for text-to-video generation and reports roughly 2x lower FVD than a monolithic baseline under matched total compute.

Towards Anatomically Plausible Human Image Generation via Synthetic Localized Preferences

cs.CV · 2026-05-25 · unverdicted · none · ref 17 · internal anchor

ASAP generates over 10K synthetic anatomical preference pairs via targeted degradation of high-fidelity images and applies a localized margin-bounded DPO to reduce anatomical errors in text-to-image human generation, supported by the new HAP dataset and HAF-Bench.

DRM: Diffusion-based Reward Model With Step-wise Guidance

cs.CV · 2026-05-25 · unverdicted · none · ref 20 · internal anchor

DRM turns a pre-trained diffusion model into a step-wise reward model and uses it for dense RL training (Step-wise GRPO) and guided sampling to improve final image quality.

Action-Prior Denoising for Smooth Real-Time Chunking

cs.RO · 2026-05-25 · unverdicted · none · ref 4 · internal anchor

Soft RTC uses partially denoised states for overlap tokens and token-wise blending to reduce action delta and jerk by ~9% versus hard RTC while matching solve rates on Kinetix levels.

Geo-Align: Video Generation Alignment via Metric Geometry Reward

cs.CV · 2026-05-22 · unverdicted · none · ref 50 · internal anchor

Geo-Align applies RL with a perceptual reward derived from 3D camera trajectory estimation to improve controllability and fidelity in video generation without paired training data.

GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

cs.CV · 2026-05-22 · unverdicted · none · ref 60 · internal anchor

GenRecon lifts object-level generative priors to scene-scale reconstruction by chunking scenes and using projection-based conditioning on multi-view features, claiming 16% better results than prior methods.

VINS-120K: Ultra High-Resolution Image Editing with A Large-Scale Dataset

cs.CV · 2026-05-22 · unverdicted · none · ref 28 · internal anchor

VINS-120K supplies the first large-scale set of instruction-image-edited-image triplets at ultra-high resolution together with an adaptation strategy that improves detail synthesis.

VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

cs.CV · 2026-05-22 · unverdicted · none · ref 25 · internal anchor

VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.

Increasing the Precision of Surrogate Models for Weak Lensing Mass Maps with Flow Matching

astro-ph.CO · 2026-05-22 · unverdicted · none · ref 24 · internal anchor

A flow matching generative model produces weak lensing mass maps with fidelity improved to below 1% and 5% on basic and higher-order statistics relative to GAN benchmarks.

Let EEG Models Learn EEG

cs.CV · 2026-05-20 · unverdicted · none · ref 39 · internal anchor

JET is a conditional flow matching framework that generates EEG as continuous raw sequences with added constraints for spectral and temporal properties, achieving over 40% lower TS-FID than prior discrete denoising methods on three benchmarks.

Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

cs.CV · 2026-05-20 · unverdicted · none · ref 14 · internal anchor

Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.

DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation

cs.CV · 2026-05-20 · unverdicted · none · ref 10 · 2 links · internal anchor

DySink maintains a memory bank and retrieves relevant historical frames as dynamic sinks while using an anomaly gate to suppress collapse, yielding higher temporal quality and dynamic degree on minute-long videos.

Beyond the Bellman Recursion: A Pontryagin-Guided Framework for Non-Exponential Discounting

cs.LG · 2026-05-20 · unverdicted · none · ref 52 · internal anchor

PG-DPO is a new variational framework that replaces Bellman recursion with a Pontryagin-guided adjoint-MC projection for RL under non-exponential discounting and shows gains on hyperbolic and survival benchmarks.

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

cs.LG · 2026-05-20 · unverdicted · none · ref 2 · internal anchor

CAdam reinterprets densification in generative 3DGS as signal verification via gradient-moment interference, quantile context, and SNR gating to achieve large reductions in primitive count with comparable quality.

MetaEarth-MM: Unified Multimodal Remote Sensing Image Generation with Scene-centered Joint Modeling

cs.CV · 2026-05-19 · conditional · none · ref 63 · internal anchor

MetaEarth-MM unifies multi-modal remote sensing image generation and any-to-any translation across five modalities via scene-centered joint modeling on the new EarthMM dataset.

Probability-Conserving Flow Guidance

cs.CV · 2026-05-19 · unverdicted · none · ref 7 · internal anchor

AdaMaG is a guidance rule for generative models derived from decomposing continuity-equation effects into divergence and score-parallel terms, with a proof that divergence diverges near the manifold and a time-dependent bound that improves realism at no extra cost.
