MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation

Yang, K · 2025 · arXiv 2505.17543

8 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

OmniDance: Multimodal Driven Dance Video Generation with Large-scale Internet Data

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

Introduces CIPE-Dance as the largest dance video dataset and OmniDance framework for unified text-music multimodal dance video generation achieving SOTA on TI2V, MI2V, and MTI2V tasks.

Interactive Multi-Turn Retrieval for Health Videos

cs.IR · 2026-05-02 · unverdicted · novelty 6.0

DATR combines coarse CLIP-based retrieval with multi-turn query fusion and cross-encoder re-ranking to improve health video retrieval, supported by the new MHVRC corpus.

CustomDancer: Customized Dance Recommendation by Text-Dance Retrieval

cs.MM · 2026-05-01 · unverdicted · novelty 6.0

CustomDancer achieves state-of-the-art text-to-dance retrieval with 10.23% Recall@1 on the new TD-Data dataset by aligning text, music, and motion features through a CLIP-based framework.

PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

PianoFlow generates coordinated bimanual piano motions from audio via MIDI-distilled flow-matching, asymmetric role-gated interaction, and autoregressive streaming continuation, outperforming priors with 9x faster inference.

BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

BiTDiff combines BiMamba-Transformer architecture with diffusion and human-kinematic decomposition to generate high-quality 3D conducting motions from music, achieving SOTA results on the new CM-Data dataset.

Listen to Rhythm, Choose Movements: Autoregressive Multimodal Dance Generation via Diffusion and Mamba with Decoupled Dance Dataset

cs.GR · 2026-01-06 · unverdicted · novelty 6.0

LRCM is a new multimodal diffusion model with audio and text Conformers plus Motion Temporal Mamba for generating long, coherent dance sequences from rhythm and descriptions using a decoupled dataset.

Embedding-perturbed Exploration Preference Optimization for Flow Models

cs.CV · 2026-05-15 · unverdicted · novelty 5.0

E²PO uses embedding-level perturbations to maintain intra-group variance and discriminative signal in RL-based preference optimization for generative flow models.

MG-Former: A Transformer-Based Framework for Music-Driven 3D Conducting Gesture Generation

cs.SD · 2026-05-02 · unverdicted · novelty 5.0

TransConductor generates 3D conducting gestures from music via a Trans-Temporal Music Encoder and Gesture Decoder, outperforming baselines on retrieval-based alignment metrics with a new ConductorMotion dataset.

citing papers explorer

Showing 8 of 8 citing papers.

OmniDance: Multimodal Driven Dance Video Generation with Large-scale Internet Data · cs.CV · 2026-06-29 · unverdicted · none · ref 42
Interactive Multi-Turn Retrieval for Health Videos · cs.IR · 2026-05-02 · unverdicted · none · ref 34
CustomDancer: Customized Dance Recommendation by Text-Dance Retrieval · cs.MM · 2026-05-01 · unverdicted · none · ref 29
PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination · cs.CV · 2026-04-14 · unverdicted · none · ref 71
BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion · cs.CV · 2026-04-06 · unverdicted · none · ref 7
Listen to Rhythm, Choose Movements: Autoregressive Multimodal Dance Generation via Diffusion and Mamba with Decoupled Dance Dataset · cs.GR · 2026-01-06 · unverdicted · none · ref 40
Embedding-perturbed Exploration Preference Optimization for Flow Models · cs.CV · 2026-05-15 · unverdicted · none · ref 91
MG-Former: A Transformer-Based Framework for Music-Driven 3D Conducting Gesture Generation · cs.SD · 2026-05-02 · unverdicted · none · ref 38
