ID-Eraser: Proactive Defense Against Face Swapping via Identity Perturbation
Pith reviewed 2026-05-09 21:47 UTC · model grok-4.3
The pith
ID-Eraser blocks face swapping by injecting perturbations into identity embeddings and reconstructing natural-looking images that carry no usable identity signal for deepfake models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ID-Eraser removes identifiable facial information by injecting perturbations directly into identity embeddings and then reconstructing the images with a Face Revive Generator. The resulting protection images remain visually realistic yet render the original identity unusable for downstream face swapping and recognition systems, even when those systems are unseen during training.
What carries the argument
The identity perturbation step followed by the Face Revive Generator, which adds targeted noise to embeddings to erase usable identity information and then reconstructs natural images from the altered embeddings.
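The two-step mechanism can be sketched in a few lines. Everything below is a hypothetical stand-in, not the paper's implementation: the toy encoder, the fixed perturbation vector, and the magnitudes are assumptions, and the FRG reconstruction step is reduced to a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_identity(image: np.ndarray) -> np.ndarray:
    """Stand-in for a face-recognition encoder: any fixed map from an
    image to a unit-norm embedding (real systems use e.g. ArcFace)."""
    v = image.reshape(-1)[:128].astype(np.float64)
    return v / np.linalg.norm(v)

def perturb_embedding(z: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Feature-space perturbation: add an offset in embedding space and
    renormalize. In ID-Eraser the offset is learnable; here it is a
    fixed random vector for illustration."""
    z_adv = z + delta
    return z_adv / np.linalg.norm(z_adv)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b)

image = rng.random((16, 16, 3))
z = extract_identity(image)
delta = 0.8 * rng.standard_normal(z.shape) / np.sqrt(z.size)
z_protected = perturb_embedding(z, delta)

# In the real system, a trained Face Revive Generator would now map
# z_protected back to a natural-looking image; that step is omitted here.
# The clean self-similarity is 1.0; the perturbed embedding sits below it.
sim = cosine(z, z_protected)
```

The design choice worth noting is that the perturbation lives in embedding space, not pixel space, which is the paper's argued advantage over pixel-level defenses.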
If this is right
- Swaps generated from protected images drop to an average identity similarity of 0.504 across five representative face swapping models.
- The method records the lowest Top-1 identity recognition accuracy of 0.30 while maintaining the best FID of 1.64 and LPIPS of 0.020.
- The defense remains effective under common image distortions and on commercial APIs such as Tencent's, where similarity falls from 0.76 to 0.36.
- The defense shows strong cross-dataset generalization without retraining.
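For concreteness, "identity similarity" in these numbers is conventionally the cosine similarity between face-recognition embeddings of the swap output and the source identity, averaged over swapping models. A sketch on synthetic embeddings follows; the embedding dimension, the noise scale, and the five stand-in "models" are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def identity_similarity(e_source: np.ndarray, e_swap: np.ndarray) -> float:
    """Cosine similarity between unit-normalized identity embeddings,
    the metric behind the reported 0.504 average."""
    a = e_source / np.linalg.norm(e_source)
    b = e_swap / np.linalg.norm(e_swap)
    return float(a @ b)

rng = np.random.default_rng(1)
e_source = rng.standard_normal(512)

# Stand-ins for embeddings of swaps produced by five different swapping
# models; heavy noise models an identity signal degraded by protection.
swap_embeddings = [e_source + 2.0 * rng.standard_normal(512) for _ in range(5)]
avg_sim = float(np.mean([identity_similarity(e_source, e) for e in swap_embeddings]))
# avg_sim lands well below the clean self-similarity of 1.0.
```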
Where Pith is reading between the lines
- The same embedding-level erasure could be applied to protect against other generative manipulations such as face reenactment or attribute editing.
- Widespread use might require embedding the generator in camera apps or social platforms so users can protect photos before upload.
- The approach opens the possibility of similar feature-space defenses for other biometric data like voice or gait.
Load-bearing premise
Perturbations added to identity embeddings can be mapped back into images that carry no usable identity signal for any unseen swapping model, while still appearing identical to the originals to human eyes, even after real-world distortions.
What would settle it
A new commercial or open-source face swapping model that produces identity similarity scores above 0.6 when run on the protected images would show the defense does not hold.
Original abstract
Deepfake technologies have rapidly advanced with modern generative AI, and face swapping in particular poses serious threats to privacy and digital security. Existing proactive defenses mostly rely on pixel-level perturbations, which are ineffective against contemporary swapping models that extract robust high-level identity embeddings. We propose ID-Eraser, a feature-space proactive defense that removes identifiable facial information to prevent malicious face swapping. By injecting learnable perturbations into identity embeddings and reconstructing natural-looking protection images through a Face Revive Generator (FRG), ID-Eraser produces visually realistic results for humans while rendering the protected identities unusable for Deepfake models. Experiments show that ID-Eraser substantially disrupts identity recognition across diverse face recognition and swapping systems under strict black-box settings, achieving the lowest Top-1 accuracy (0.30) with the best FID (1.64) and LPIPS (0.020). Compared with swaps generated from clean inputs, the identity similarity of protected swaps drops sharply to an average of 0.504 across five representative face swapping models. ID-Eraser further demonstrates strong cross-dataset generalization, robustness to common distortions, and practical effectiveness on commercial APIs, reducing Tencent API similarity from 0.76 to 0.36.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ID-Eraser, a feature-space proactive defense against face swapping. It injects learnable perturbations into identity embeddings extracted by a surrogate model and reconstructs the image via a Face Revive Generator (FRG) to produce visually realistic outputs that retain no usable identity signal for downstream deepfake models. Under black-box settings, experiments report that protected images yield the lowest Top-1 accuracy (0.30), best FID (1.64) and LPIPS (0.020), and reduce average identity similarity to 0.504 across five swapping models plus a commercial API, with additional claims of cross-dataset generalization and robustness to distortions.
Significance. If the central empirical claims hold, the work offers a meaningful advance over pixel-level perturbation defenses by operating directly on identity embeddings, achieving superior visual fidelity while disrupting recognition. The breadth of evaluation (multiple FR models, Tencent API, distortions, cross-dataset tests) and the introduction of the FRG reconstruction step provide concrete evidence that feature-space erasure can be practical. These strengths would position the paper as a useful contribution to proactive deepfake privacy research.
major comments (3)
- [§4 Experiments] The headline quantitative results (Top-1 accuracy 0.30, average similarity 0.504, FID 1.64, LPIPS 0.020) are presented without any description of training procedures for the learnable perturbation parameters, the FRG architecture and loss, dataset splits, or statistical significance testing. These omissions are load-bearing because the central claim of effective black-box identity erasure cannot be verified or reproduced from the given text.
- [§3.1 Identity Perturbation and §3.2 Face Revive Generator] The transferability argument—that perturbations optimized on one surrogate embedding space plus FRG reconstruction render the image unusable for arbitrary unseen swapping models—is not supported by any analysis showing that the FRG output lies far from the original identity in every possible embedding space. Different FR models employ distinct loss landscapes and feature hierarchies; the reported drops are consistent with transfer to related architectures but do not demonstrate model-agnostic erasure.
- [§4 Experiments] No ablation studies are reported on perturbation magnitude, FRG hyperparameters, or surrogate choice. Without these, it is impossible to determine whether the observed performance (e.g., the similarity drop from clean to protected swaps) is robust or sensitive to specific implementation choices, undermining the generalization claims.
minor comments (2)
- [Abstract and §4] The abstract and §4 would benefit from a concise statement of the exact evaluation protocol for 'Top-1 accuracy' and 'identity similarity' (e.g., number of gallery identities, distance metric, and whether the same surrogate is used for all tests).
- [§3] Notation for the perturbation parameters and FRG components should be introduced once with consistent symbols; currently the text alternates between descriptive phrases and implicit references.
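A minimal version of the Top-1 identification protocol the referee asks to see spelled out might look like the following. The gallery size, embedding dimension, and probe noise level are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def top1_accuracy(probe_embs, gallery_embs, probe_ids, gallery_ids):
    """Top-1 identification: match each probe to the gallery entry with
    the highest cosine similarity and score the fraction of probes whose
    nearest gallery identity is correct. Gallery size and distance
    metric are exactly the protocol details a paper should report."""
    p = probe_embs / np.linalg.norm(probe_embs, axis=1, keepdims=True)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    nearest = np.argmax(p @ g.T, axis=1)
    return float(np.mean(np.asarray(gallery_ids)[nearest] == np.asarray(probe_ids)))

rng = np.random.default_rng(2)
gallery = rng.standard_normal((10, 64))   # 10 enrolled identities
ids = np.arange(10)
# Clean probes: lightly noised copies of the enrolled embeddings.
clean_probes = gallery + 0.1 * rng.standard_normal((10, 64))
acc = top1_accuracy(clean_probes, gallery, ids, ids)  # near 1.0 on clean probes
```

A protected image would be scored by the same function with its probe embedding replaced by that of the protected (or swapped) output; the paper's claim is that accuracy then falls to 0.30.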
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing honest responses and indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
Referee: [§4 Experiments] The headline quantitative results (Top-1 accuracy 0.30, average similarity 0.504, FID 1.64, LPIPS 0.020) are presented without any description of training procedures for the learnable perturbation parameters, FRG architecture and loss, dataset splits, or statistical significance testing. These omissions are load-bearing because the central claim of effective black-box identity erasure cannot be verified or reproduced from the given text.
Authors: We agree that these details are essential for reproducibility and were insufficiently described in the original submission. In the revised manuscript, we will expand §4 with a new subsection detailing the training procedures for the learnable perturbation parameters (including optimization method, learning rate, and epochs), the full FRG architecture and loss functions, dataset splits (e.g., from FFHQ or VGGFace2), and statistical significance measures such as standard deviations over multiple random seeds. This will directly address the verifiability concern. revision: yes
Referee: [§3.1 and §3.2] The transferability argument—that perturbations optimized on one surrogate embedding space plus FRG reconstruction render the image unusable for arbitrary unseen swapping models—is not supported by any analysis showing that the FRG output lies far from the original identity in every possible embedding space. Different FR models employ distinct loss landscapes and feature hierarchies; the reported drops are consistent with transfer to related architectures but do not demonstrate model-agnostic erasure.
Authors: We acknowledge that the manuscript does not include a theoretical analysis proving the FRG output is distant from the original identity in every conceivable embedding space, as such exhaustive coverage is impractical. Our claims rest on strong empirical transferability demonstrated across five diverse swapping models plus the Tencent API. In revision, we will update §3.1 and §3.2 to explicitly state the limitations of the transferability argument, clarify that it is supported by representative model diversity rather than universality, and discuss why identity embeddings share sufficient common structure for practical erasure. revision: partial
Referee: [§4 Experiments] No ablation studies are reported on perturbation magnitude, FRG hyperparameters, or surrogate choice. Without these, it is impossible to determine whether the observed performance (e.g., similarity drop from clean to protected swaps) is robust or sensitive to specific implementation choices, undermining the generalization claims.
Authors: We agree that the absence of ablations limits assessment of robustness. In the revised manuscript, we will add ablation studies in §4 (or an appendix) examining perturbation magnitude, FRG loss hyperparameters, and surrogate model variations, with corresponding quantitative results on identity similarity and image quality metrics. These will demonstrate that the reported performance holds across reasonable ranges of these choices. revision: yes
Circularity Check
No circularity: empirical method with external validation
full rationale
The paper describes a feature-space defense (ID-Eraser) that perturbs identity embeddings and reconstructs images via a trained Face Revive Generator. All headline claims are empirical measurements of identity similarity, Top-1 accuracy, FID, and LPIPS on held-out face recognition and swapping models plus a commercial API. No equations, uniqueness theorems, or self-citations are invoked to derive the transferability result; the reported drops (e.g., similarity 0.504, Tencent 0.36) are presented as measured outcomes on external systems rather than quantities forced by construction or prior self-work. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- learnable perturbation parameters
axioms (1)
- domain assumption: Perturbed identity embeddings can be mapped back to natural images that retain no usable identity signal for black-box models
invented entities (1)
- Face Revive Generator (FRG) (no independent evidence)