Editworld: Simulating world dynamics for instruction-following image editing. arXiv preprint arXiv:2405.14785, 2024a
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
Citing papers:
- Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
  Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemphasizing perceptual quality.
- DanceDuo: Bridging Human Movement and AI Choreography
  DanceDuo applies diffusion models for music-synchronized dance generation and pose estimation for user-AI performance comparison, with a user study reporting positive feedback on usability.