SingMOS: An Extensive Open-Source Singing Voice Dataset for MOS Prediction
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SD (2)
verdicts: UNVERDICTED (2)
representative citing papers
MOS-Bench benchmark shows that existing SSQA models struggle with out-of-domain generalization and that training on multiple diverse datasets improves robustness.
citing papers explorer
- CoMelSinger: Discrete Token-Based Zero-Shot Singing Synthesis With Structured Melody Control and Guidance
CoMelSinger introduces a discrete token-based zero-shot SVS framework on MaskGCT with coarse-to-fine contrastive learning and an SVT module to improve melody control and reduce prosody leakage.
- MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models
MOS-Bench benchmark shows that existing SSQA models struggle with out-of-domain generalization and that training on multiple diverse datasets improves robustness.