DanceCrafter generates high-fidelity, text-controlled dance sequences using a new Choreographic Syntax framework and a large fine-grained motion dataset.
arXiv preprint arXiv:2512.23464
6 Pith papers cite this work.
years
2026: 6
representative citing papers
SentiAvatar generates expressive interactive 3D avatars in real time by combining a 37-hour mocap dialogue dataset with a pre-trained motion foundation model and an audio-aware plan-then-infill architecture that separates semantic planning from prosody-driven frame interpolation.
CoMoVi co-generates 3D human motions and 2D videos synchronously in a single diffusion denoising loop using 3D-to-2D projection and dual-branch diffusion with 3D-2D cross attentions.
AnyAct generates editable human reenactments from character videos via conditional motion generation from transferable sparse local 2D articulated cues, with designs for human-only supervision and global-local decoupling.
IAM jointly synthesizes motion sequences and body shape parameters conditioned on multimodal identity signals to achieve more realistic and identity-consistent human motions.
citing papers explorer
- DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax
- SentiAvatar: Towards Expressive and Interactive Digital Humans
- CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos
- AnyAct: Towards Human Reenactment of Character Motion From Video
- IAM: Identity-Aware Human Motion and Shape Joint Generation
- SCRIPT: Scalable Diffusion Policy with Multi-stage Training for Language-driven Physics-based Humanoid Control