Error analyses of auto-regressive video diffusion models: A unified framework
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: support (1)
representative citing papers
Self Forcing trains autoregressive video diffusion models by performing autoregressive rollout with KV caching during training to close the exposure bias gap, using a holistic video-level loss and few-step diffusion for efficiency.
MilliVid compresses video frames into multi-scale token hierarchies and uses coarse-to-fine rollout in a diffusion model to maintain long-range geometric and object consistency on Minecraft videos.
citing papers explorer
- Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation
Robust Dreamer uses Latent Gaussian Memory anchored to diffusion latents and Deviation Learning with a Dynamic Deviation Archive to reduce drift in long-horizon action-controlled image-to-video generation, reporting SOTA results on ScanNet, DL3DV, and OmniWorldGame.
- MilliVid: Hierarchical Latents for Long-Range Consistency in Video Generation
MilliVid compresses video frames into multi-scale token hierarchies and uses coarse-to-fine rollout in a diffusion model to maintain long-range geometric and object consistency on Minecraft videos.