hub Canonical reference

One Step Diffusion via Shortcut Models

Canonical reference. 89% of citing Pith papers cite this work as background.

33 Pith papers citing it
Background 89% of classified citations
abstract

Diffusion models and flow-matching models have enabled generating diverse and realistic images by learning to transfer noise to data. However, sampling from these models involves iterative denoising over many neural network passes, making generation slow and expensive. Previous approaches for speeding up sampling require complex training regimes, such as multiple training phases, multiple networks, or fragile scheduling. We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples in a single or multiple sampling steps. Shortcut models condition the network not only on the current noise level but also on the desired step size, allowing the model to skip ahead in the generation process. Across a wide range of sampling step budgets, shortcut models consistently produce higher quality samples than previous approaches, such as consistency models and reflow. Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
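The idea of conditioning on both the noise level and the step size can be illustrated with a minimal sampling sketch. This is not the authors' implementation. It assumes an update of the form x + d * s(x, t, d), a time convention where t = 0 is noise and t = 1 is data, and a hypothetical toy function `toy_shortcut` standing in for a trained network. The names `toy_shortcut`, `sample`, and `target` are illustrative only.

```python
import numpy as np

def toy_shortcut(x, t, d, target):
    """Hypothetical stand-in for a trained network s(x, t, d).

    For a straight path from x to a single target point, the average
    direction over any step size d is (target - x) / (1 - t), so d is
    accepted but unused here. A real network would make use of d.
    """
    return (target - x) / (1.0 - t)

def sample(model, x0, num_steps, target):
    """Take num_steps equal jumps from t = 0 (noise) to t = 1 (data)."""
    d = 1.0 / num_steps              # step size, also passed to the model
    x = np.asarray(x0, dtype=float)
    for i in range(num_steps):
        t = i * d                    # current position along the path
        x = x + d * model(x, t, d, target)
    return x

rng = np.random.default_rng(0)
noise = rng.standard_normal(3)
target = np.array([1.0, -2.0, 0.5])

# The same function serves a one-step and a four-step budget.
one_step = sample(toy_shortcut, noise, 1, target)
four_step = sample(toy_shortcut, noise, 4, target)
```

In this toy setting both step budgets land on the same point, which is the behavior the abstract describes: one network, with the step budget chosen at inference time. With real data the paths are curved, so a learned model would have to depend on d for the large jumps to stay accurate.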

citation-role summary

background: 8 · baseline: 1

years

2026: 26 · 2025: 7

representative citing papers

Isokinetic Flow Matching for Pathwise Straightening of Generative Flows

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

Isokinetic Flow Matching adds a lightweight regularization term to flow matching that penalizes acceleration along paths via self-guided finite differences, yielding straighter trajectories and large gains in few-step sampling quality on CIFAR-10.

VOSR: A Vision-Only Generative Model for Image Super-Resolution

cs.CV · 2026-04-03 · conditional · novelty 7.0

VOSR shows that competitive generative image super-resolution with faithful structures can be achieved by training a diffusion-style model from scratch on visual data alone, using a vision encoder for guidance and a restoration-oriented sampling strategy.

Training Agents Inside of Scalable World Models

cs.AI · 2025-09-29 · conditional · novelty 7.0

Dreamer 4 is the first agent to obtain diamonds in Minecraft from only offline data by reinforcement learning inside a scalable world model that accurately predicts game mechanics.

Lipschitz-Guided Design of Interpolation Schedules in Generative Models

stat.ML · 2025-09-01 · unverdicted · novelty 7.0

Minimizing averaged squared Lipschitzness of the drift produces interpolation schedules that improve numerical accuracy and mitigate mode collapse in generative models, with closed-form optima for Gaussians and validation on stochastic PDEs.

Efficient Image Synthesis with Sphere Latent Encoder

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

The method decouples the Sphere Encoder into a fixed pretrained encoder and a spherical latent denoiser, yielding higher quality and faster inference than the original joint model on Animal-Faces, Oxford-Flowers, and ImageNet-1K.

FlowS: One-Step Motion Prediction via Local Transport Conditioning

cs.RO · 2026-04-28 · unverdicted · novelty 6.0

FlowS achieves state-of-the-art single-step motion prediction on Waymo Open Motion Dataset by using scene-conditioned anchor trajectories and a step-consistent displacement field to make local transport accurate in one Euler step.

FASTER: Value-Guided Sampling for Fast RL

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

FASTER models multi-candidate denoising as an MDP and trains a value function to filter actions early, delivering the performance of full sampling at lower cost in diffusion RL policies.

Self-Adversarial One Step Generation via Condition Shifting

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

APEX derives self-adversarial gradients from condition-shifted velocity fields in flow models to achieve high-fidelity one-step generation, outperforming much larger models and multi-step teachers.

Dual-End Consistency Model

cs.CV · 2026-02-11 · unverdicted · novelty 6.0

DE-CM reaches state-of-the-art one-step FID of 1.70 on ImageNet 256x256 by decomposing PF-ODE trajectories into three critical sub-trajectories and using flow matching plus N2N mapping for stability.

SAM 3D: 3Dfy Anything in Images

cs.CV · 2025-11-20 · unverdicted · novelty 6.0

SAM 3D reconstructs 3D objects from single images with geometry, texture, and pose using human-model annotated data at scale and synthetic-to-real training, achieving a 5:1 win rate in human preference comparisons.

Real-Time Execution of Action Chunking Flow Policies

cs.RO · 2025-06-09 · unverdicted · novelty 6.0

Real-time chunking (RTC) allows diffusion- and flow-based action chunking policies to execute smoothly and asynchronously, maintaining high success rates on dynamic tasks even with significant inference latency.
