UMo presents a sparse MoE-based unified model for real-time co-speech avatar animation that claims superior quality under latency constraints via keyframe-centric design and multi-stage audio-augmented training.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.GR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
UMo: Unified Sparse Motion Modeling for Real-Time Co-Speech Avatars