Toward Calibrated, Fair, and accurate Deepfake Detection

Chris Russell; Ryan Brown

read the original abstract

Deepfake detectors show large performance gaps across demographic groups. Existing fairness approaches require demographic labels, retraining, or sacrifice accuracy. We introduce Face-Fairness (FF), a plug-and-play framework for bias mitigation. Our primary contribution, Face-Feature Tuning (FFT), is the first demographic label-free fairness method demonstrated for deepfake detection: a lightweight calibrator that performs a logit remapping conditioned on frozen face embeddings. We complement FFT with two variants: FF-Max, which maximizes worst-group accuracy when demographics are available, and FF-Discover, which does the same with embedding-discovered groups. Across in-domain and cross-dataset test settings, FF consistently reduces FPR/TPR gaps and improves minimum group accuracy while maintaining (often improving) overall accuracy. The approach is detector-agnostic, adds negligible runtime overhead, and requires no access to identity attributes.

Toward Calibrated, Fair, and accurate Deepfake Detection

discussion (0)