Mimo: Controllable character video synthesis with spatial decomposed modeling. arXiv preprint arXiv:2409.16160

Yifang Men, Yuan Yao, Miaomiao Cui, Liefeng Bo · 2024 · arXiv 2409.16160

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

Introduces mesh tokenization to condition DiT-based video diffusion models directly on 3D human meshes for motion control without 2D rendering.

VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification

cs.CV · 2025-12-10 · unverdicted · novelty 6.0

VHOI densifies sparse trajectories into color-encoded HOI mask sequences and conditions a fine-tuned video diffusion model on them to produce controllable human-object interaction videos, including full navigation sequences.

Evolution of Video Generative Foundations

cs.CV · 2026-04-07 · unverdicted · novelty 2.0

This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

citing papers explorer

Showing 3 of 3 citing papers.

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization
cs.CV · 2026-06-01 · unverdicted · none · ref 28
Introduces mesh tokenization to condition DiT-based video diffusion models directly on 3D human meshes for motion control without 2D rendering.

VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification
cs.CV · 2025-12-10 · unverdicted · none · ref 53
VHOI densifies sparse trajectories into color-encoded HOI mask sequences and conditions a fine-tuned video diffusion model on them to produce controllable human-object interaction videos, including full navigation sequences.

Evolution of Video Generative Foundations
cs.CV · 2026-04-07 · unverdicted · none · ref 274
This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.
