ASVspoof 2021: Automatic speaker verification spoofing and countermeasures challenge evaluation plan
3 representative citing papers (2026):
-
Listening Deepfake Detection: A New Perspective Beyond Speaking-Centric Forgery Analysis
Introduces the LDD task, the ListenForge dataset built from five listening head generation methods, and the MANet model, which detects listening forgeries via motion inconsistencies guided by audio semantics.
-
Phoneme-Level Deepfake Detection Across Emotional Conditions Using Self-Supervised Embeddings
Phoneme-level analysis using self-supervised embeddings identifies higher divergence in complex vowels and fricatives for emotional voice conversion deepfakes, enabling more interpretable detection across emotions.
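The phoneme-level divergence idea can be sketched as a simple embedding comparison between genuine and converted speech. The function names and the cosine measure below are illustrative assumptions; the paper's exact divergence metric is not specified here.

```python
import numpy as np

def cosine_divergence(real_frames, fake_frames):
    """Cosine distance between mean frame embeddings of the same phoneme
    in genuine vs. converted speech (illustrative measure, not the
    paper's exact formulation). Inputs: (n_frames, dim) arrays."""
    r = real_frames.mean(axis=0)
    f = fake_frames.mean(axis=0)
    cos = np.dot(r, f) / (np.linalg.norm(r) * np.linalg.norm(f))
    return 1.0 - cos

def per_phoneme_divergence(segments):
    """segments: dict mapping phoneme label -> (real_frames, fake_frames).
    Returns one divergence score per phoneme, so classes such as complex
    vowels and fricatives can be ranked by how much conversion distorts them."""
    return {ph: cosine_divergence(r, f) for ph, (r, f) in segments.items()}
```

Ranking phonemes by this score is what makes the detection interpretable: the classes with the largest divergence point to where the voice conversion leaves the strongest artifacts.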
-
Deepfake Audio Detection Using Self-supervised Fusion Representations
A dual-branch fusion model with XLS-R, BEATs, Matching Head, and cross-attention achieves 70.20% F1-score and 16.54% environmental EER on CompSpoofV2, outperforming the baseline for component-level deepfake detection.
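For context, the EER reported above is the standard operating point for spoofing countermeasures: the threshold where the false-acceptance and false-rejection rates are equal. A minimal sketch of computing it from detector scores follows; the function name and the simple threshold sweep are illustrative, not taken from the paper.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Equal Error Rate from per-utterance detector scores, where a higher
    score means 'more likely genuine'. Sweeps every observed score as a
    threshold and returns the rate where false accepts (spoof passed as
    genuine) and false rejects (genuine flagged as spoof) cross."""
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    far = np.array([(spoof_scores >= t).mean() for t in thresholds])
    frr = np.array([(bonafide_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))  # closest crossing point
    return (far[idx] + frr[idx]) / 2
```

A perfectly separating detector yields an EER of 0; random scoring gives 0.5, so the 16.54% figure above sits between those extremes.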