Multilingual Source Tracing of Speech Deepfakes: A First Benchmark
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Dual-Branch Gated Fusion for Open-Set Audio Deepfake Source Tracing
  A gated fusion of XLSR-53 and CORES features with energy margin and diversity losses reaches 97.6% ID accuracy and reduces FPR95 by 83.5% relative to the Interspeech 2025 baseline on MLAAD.
- Anchoring the Unknown: Open-Set Model Attribution via Proxy-Anchor Learning
  Proxy-Anchor metric learning on Wav2Vec2-BERT embeddings with architecture merging achieves 99.76% closed-set accuracy and 2.04% FPR@95 OOD detection on MLAAD v9, doubling prior OOD accuracy on v5 splits.