A joint fullband-subband model using high-resolution 44.1 kHz audio outperforms standard 16 kHz detectors for singing voice deepfake detection by exploiting spectrum-specific synthesis artifacts.
Are music foundation models better at singing voice deepfake detection? far-better fuse them with speech foun- dation models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.SD 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Joint Fullband-Subband Modeling for High-Resolution SingFake Detection
A joint fullband-subband model using high-resolution 44.1 kHz audio outperforms standard 16 kHz detectors for singing voice deepfake detection by exploiting spectrum-specific synthesis artifacts.