Tora: Trajectory-oriented diffusion transformer for video generation
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- HandsOnWorld: Unconstrained Egocentric Video Generation with Camera-Disentangled Hand Control
  HandsOnWorld builds a hand-controlled egocentric video generator from unconstrained monocular video, using a new EgoVid-Pro dataset obtained by monocular reconstruction and a Plücker Hand Map that disentangles camera motion from hand motion.
- Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization
  Introduces mesh tokenization, which conditions DiT-based video diffusion models directly on 3D human meshes for motion control, without 2D rendering.