CoMoSpeech: One-step speech and singing voice synthesis via consistency model
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Enhancing Flow Matching with A Unified Guidance Framework for Efficient and Robust Speech Synthesis
Unified guidance framework for Flow Matching speech synthesis achieves nearly 3x faster inference and improved speaker similarity by combining heterogeneous data augmentation with intrinsic model guidance to eliminate CFG overhead.