Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.
Diffusion Models Beat GANs on Image Synthesis , url =
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.LG 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
A compositional diffusion world model integrates three specialized memory experts via contrastive product-of-experts to improve temporal consistency, past recall, and navigation while scaling to long contexts without quadratic costs.
citing papers explorer
-
Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.
-
Composition of Memory Experts for Diffusion World Models
A compositional diffusion world model integrates three specialized memory experts via contrastive product-of-experts to improve temporal consistency, past recall, and navigation while scaling to long contexts without quadratic costs.