Introduces the LDD task, ListenForge dataset built from five listening head generation methods, and MANet model that detects listening forgeries via motion inconsistencies guided by audio semantics.
ASVspoof 2021: Automatic speaker verification spoofing and countermeasures challenge evaluation plan
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (5)
roles: background (1)
polarities: background (1)
representative citing papers
Introduces SHAC task, CARDIOFAKE dataset, and GROOT fusion model achieving SOTA detection of neural codec-synthesized phonocardiograms using MFCC and WavLM features.
Phoneme-level analysis using self-supervised embeddings identifies higher divergence in complex vowels and fricatives for emotional voice conversion deepfakes, enabling more interpretable detection across emotions.
A dual-branch fusion model with XLS-R, BEATs, Matching Head, and cross-attention achieves 70.20% F1-score and 16.54% environmental EER on CompSpoofV2, outperforming the baseline for component-level deepfake detection.
The paper analyzes evolving security and safety threats in generative AI from content generation to agentic actions, noting that attack surfaces expand faster than defenses and that many safeguards require institutional coordination not yet in place.
citing papers explorer
- Towards Detecting Neural Audio Codec Synthesized Heart Sounds
Introduces SHAC task, CARDIOFAKE dataset, and GROOT fusion model achieving SOTA detection of neural codec-synthesized phonocardiograms using MFCC and WavLM features.