Context as memory: Scene-consistent interactive long video generation with memory retrieval
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
McCast uses a Drift-Corrective Memory Bank to actively correct latent drift in autoregressive precipitation nowcasting for more coherent long-horizon forecasts.
Citing papers explorer
- CT-1: Vision-Language-Camera Models Transfer Spatial Reasoning Knowledge to Camera-Controllable Video Generation
CT-1 transfers spatial reasoning from vision-language models to estimate camera trajectories, which are then used in a video diffusion model with wavelet regularization to produce controllable videos, claiming 25.7% better accuracy than prior methods.
- McCast: Memory-Guided Latent Drift Correction for Long-Horizon Precipitation Nowcasting
McCast uses a Drift-Corrective Memory Bank to actively correct latent drift in autoregressive precipitation nowcasting for more coherent long-horizon forecasts.