A zero-shot open-set speech deepfake source tracing framework using adapted SSL-AASIST embeddings and AAM loss achieves an EER of 16.43% in out-of-distribution (OOD) trials with cosine scoring, outperforming the few-shot alternatives.
Advancing Zero-Shot Open-Set Speech Deepfake Source Tracing
abstract
We propose a novel zero-shot source tracing framework inspired by speaker verification. We adapt SSL-AASIST for attack classification, enhancing embeddings with AAM loss and RegMixup, and ensure that training attacks are disjoint from those forming fingerprint-trial pairs. For backend scoring in attack verification, we explore both zero-shot approaches (cosine similarity and Siamese) and few-shot approaches (MLP and Siamese). Experiments on our recently introduced STOPA dataset in an open-set setting show that few-shot learning provides advantages in the in-distribution (ID) scenario, while zero-shot approaches perform better in the out-of-distribution (OOD) scenario. In attack source verification with ID trials, few-shot Siamese and MLP achieve equal error rates (EER) of 17.72% and 13.11%, respectively, compared to 29.91% for zero-shot cosine scoring. Conversely, in OOD trials, zero-shot cosine scoring reaches 16.43%, outperforming few-shot Siamese at 23.47% and MLP at 21.57%.
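The zero-shot cosine backend and the EER metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`make_fingerprint`, `cosine_score`, `equal_error_rate`) are hypothetical, building a fingerprint by averaging length-normalized enrollment embeddings is an assumption borrowed from common speaker-verification practice, and the embeddings are random stand-ins rather than SSL-AASIST outputs.

```python
import numpy as np


def make_fingerprint(enrollment_embeddings):
    """Average length-normalized embeddings of one attack (assumed recipe)."""
    e = np.asarray(enrollment_embeddings, dtype=float)
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    return e.mean(axis=0)


def cosine_score(fingerprint, trial):
    """Cosine similarity between an attack fingerprint and a trial embedding."""
    f = np.asarray(fingerprint, dtype=float)
    t = np.asarray(trial, dtype=float)
    return float(np.dot(f, t) / (np.linalg.norm(f) * np.linalg.norm(t)))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: target trials scored below the threshold.
    fnr = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: non-target trials scored at or above the threshold.
    fpr = np.array([(nontarget >= t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(fnr - fpr)))
    return float((fnr[i] + fpr[i]) / 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 16
    # Two synthetic "attacks", each a noisy cluster around its own centre.
    centre_a, centre_b = rng.normal(size=dim), rng.normal(size=dim)
    attack_a = centre_a + 0.3 * rng.normal(size=(20, dim))
    attack_b = centre_b + 0.3 * rng.normal(size=(20, dim))

    # Enroll attack A from its first five utterances, then score the rest.
    fingerprint_a = make_fingerprint(attack_a[:5])
    target = [cosine_score(fingerprint_a, x) for x in attack_a[5:]]
    nontarget = [cosine_score(fingerprint_a, x) for x in attack_b]
    print(f"EER on synthetic trials: {100 * equal_error_rate(target, nontarget):.2f}%")
```

No backend training is involved here: the fingerprint and the trial are compared directly, which is what makes cosine scoring a zero-shot option, whereas the few-shot MLP and Siamese backends need labelled examples to fit.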
fields
eess.AS
years
2025