Improved distribution matching distillation for fast image synthesis. Advances in neural information processing systems, 37:47455–47487, 2024a

Shenghai Yuan et al. · 2026 · arXiv 2603.04379

10 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 3

citation-polarity summary

background 2 · unclear 1

representative citing papers

MBench: A Comprehensive Benchmark on Memory Capability for Video World Models

cs.CV · 2026-05-30 · unverdicted · novelty 7.0

MBench is a new benchmark that quantifies long-term memory in video world models via three hierarchical consistency dimensions evaluated on curated real videos.

Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

Future Forcing constructs a future query proxy from historical pre-RoPE statistics to score and merge KV tokens, improving subject consistency by up to 1.49 on VBench-Long for 60s AR video generation.

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

MultiWorld is a scalable framework for multi-agent multi-view video world models that improves controllability and consistency over single-agent baselines in game and robot tasks.

Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Warp-as-History enables zero-shot camera trajectory following in frozen video models by supplying camera-warped pseudo-history, with single-video LoRA fine-tuning improving generalization to unseen videos.

HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

HorizonDrive is a new anti-drifting autoregressive training and distillation method that enables minute-scale stable driving video rollouts by making the teacher model rollout-capable via scheduled rollout recovery and teacher rollout DMD.

Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models

cs.CV · 2026-05-10 · unverdicted · novelty 6.0

Forcing-KV applies head-specific static and dynamic pruning to KV caches in AR video diffusion models, achieving over 29 fps, 30% memory reduction, and up to 2.82x speedup at maintained quality.

Human Cognition in Machines: A Unified Perspective of World Models

cs.RO · 2026-04-17 · unverdicted · novelty 6.0

The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.

Focused Forcing: Content-Aware Per-Frame KV Selection for Efficient Autoregressive Video Diffusion

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Focused Forcing is a training-free per-frame KV selection method that combines attention scores with diversity metrics and head-importance estimation to accelerate autoregressive video diffusion up to 1.48x while improving quality.

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

cs.CV · 2026-05-19

citing papers explorer


MBench: A Comprehensive Benchmark on Memory Capability for Video World Models · cs.CV · 2026-05-30 · unverdicted · none · ref 95
Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation · cs.CV · 2026-05-28 · unverdicted · none · ref 54
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation · cs.CV · 2026-05-18 · unverdicted · none · ref 71
MultiWorld: Scalable Multi-Agent Multi-View Video World Models · cs.CV · 2026-04-20 · unverdicted · none · ref 66
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video · cs.CV · 2026-05-14 · unverdicted · none · ref 15
HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation · cs.CV · 2026-05-12 · unverdicted · none · ref 27 · 2 links
Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models · cs.CV · 2026-05-10 · unverdicted · none · ref 31
Human Cognition in Machines: A Unified Perspective of World Models · cs.RO · 2026-04-17 · unverdicted · none · ref 209
Focused Forcing: Content-Aware Per-Frame KV Selection for Efficient Autoregressive Video Diffusion · cs.CV · 2026-05-18 · unverdicted · none · ref 52
MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation · cs.CV · 2026-05-19 · unreviewed · ref 75
