arXiv preprint arXiv:2511.03334 (2025)

Guozhen Zhang, Zixiang Zhou, Teng Hu, Ziqiao Peng, Youliang Zhang, Yi Chen, Yuan Zhou, Qinglin Lu, Limin Wang · 2025 · arXiv 2511.03334

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Native Audio-Visual Alignment for Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

NAVA proposes native audio-visual alignment via Align-then-Fuse MMDiT and Timbre-in-Context Conditioning for joint audio-video generation with improved synchronization and timbre control.

MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

MTAVG-Bench 2.0 is a new benchmark that evaluates omni LLMs on diagnosing high-level cinematic failures in multi-talker audio-video generation using a taxonomy of acting, narrative, atmosphere, and audio-visual language.

Unison: Harmonizing Motion, Speech, and Sound for Human-Centric Audio-Video Generation

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

Unison introduces a unified framework using semantic-guided harmonization and bidirectional cross-modal forcing to generate human-centric videos with improved synchronization between motion, speech, and sound effects.

Interactive Multi-Turn Retrieval for Health Videos

cs.IR · 2026-05-02 · unverdicted · novelty 6.0

DATR combines coarse CLIP-based retrieval with multi-turn query fusion and cross-encoder re-ranking to improve health video retrieval, supported by the new MHVRC corpus.

Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation

cs.CV · 2026-04-28 · unverdicted · novelty 6.0

Mutual Forcing trains a single native autoregressive audio-video model with mutually reinforcing few-step and multi-step modes via self-distillation to match 50-step baselines at 4-8 steps.

OmniHuman: A Large-scale Dataset and Benchmark for Human-Centric Video Generation

cs.CV · 2026-04-20

citing papers explorer

Native Audio-Visual Alignment for Generation · cs.CV · 2026-05-28 · unverdicted · none · ref 28
MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation · cs.AI · 2026-05-27 · unverdicted · none · ref 39
Unison: Harmonizing Motion, Speech, and Sound for Human-Centric Audio-Video Generation · cs.CV · 2026-05-09 · unverdicted · none · ref 44
Interactive Multi-Turn Retrieval for Health Videos · cs.IR · 2026-05-02 · unverdicted · none · ref 38
Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation · cs.CV · 2026-04-28 · unverdicted · none · ref 58
OmniHuman: A Large-scale Dataset and Benchmark for Human-Centric Video Generation · cs.CV · 2026-04-20 · unreviewed · ref 57
