Introduces SHAC task, CARDIOFAKE dataset, and GROOT fusion model achieving SOTA detection of neural codec-synthesized phonocardiograms using MFCC and WavLM features.
Towards Detecting Neural Audio Codec Synthesized Heart Sounds
abstract
In this paper, we introduce Synthetic Heart Sound Detection (SHAC), a task aimed at identifying phonocardiograms (PCGs) synthesized using neural audio codecs (NACs). To facilitate research in this direction, we release CARDIOFAKE, the first benchmark dataset for SHAC, containing both real and codec-synthesized PCGs. We benchmark spectral representations (MFCC, LFCC) and self-supervised learning (SSL) representations (e.g., WavLM) on the task. Furthermore, we propose GROOT, a fusion framework that integrates spectral and SSL features to exploit their complementary behavior. Experiments show that GROOT, combining MFCC and WavLM, achieves state-of-the-art performance, outperforming both the individual representations and competitive baselines.
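The abstract does not describe how GROOT combines the two representations, so the following is only a minimal sketch of the general idea of fusing a spectral and an SSL feature stream, assuming plain mean-pooling and concatenation. The function names, the feature shapes (13 MFCC coefficients, 768-dimensional SSL frames), and the random placeholder inputs are illustrative assumptions, not the authors' method or data.

```python
import numpy as np


def pool_frames(frames: np.ndarray) -> np.ndarray:
    """Mean-pool a (num_frames, dim) feature matrix into a (dim,) vector."""
    return frames.mean(axis=0)


def l2_normalize(vec: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale a vector to unit L2 norm so neither stream dominates by magnitude."""
    return vec / (np.linalg.norm(vec) + eps)


def fuse_features(spectral_frames: np.ndarray, ssl_frames: np.ndarray) -> np.ndarray:
    """Generic concatenation fusion (an assumption, not the GROOT architecture)."""
    spectral_vec = l2_normalize(pool_frames(spectral_frames))
    ssl_vec = l2_normalize(pool_frames(ssl_frames))
    return np.concatenate([spectral_vec, ssl_vec])


# Random placeholders standing in for real features of one recording.
rng = np.random.default_rng(0)
mfcc_frames = rng.normal(size=(200, 13))    # hypothetical: 200 frames x 13 MFCCs
wavlm_frames = rng.normal(size=(99, 768))   # hypothetical: 99 frames x 768-dim SSL features

fused = fuse_features(mfcc_frames, wavlm_frames)
print(fused.shape)  # (781,)
```

A fused vector like this could then be fed to any binary real-versus-synthetic classifier. Normalizing each stream before concatenation is one simple way to keep the much larger SSL vector from swamping the spectral one.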
fields: eess.AS (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)