Face-D²CL: Multi-Domain Synergistic Representation with Dual Continual Learning for Facial DeepFake Detection
Pith reviewed 2026-05-10 18:05 UTC · model grok-4.3
The pith
A framework fuses spatial and frequency features with dual continual learning to let deepfake detectors adapt to new forgeries without forgetting or replaying old data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Face-D²CL uses multi-domain synergistic representation to fuse spatial and frequency-domain features, aiming to capture diverse forgery traces comprehensively. A dual continual learning mechanism complements this fusion: elastic weight consolidation distinguishes parameter importance for real versus fake samples, and an orthogonal gradient constraint ensures that task-specific adapter updates do not interfere with previously learned knowledge.
What carries the argument
Multi-domain synergistic representation that fuses spatial and frequency-domain features, combined with the dual continual learning mechanism of elastic weight consolidation and orthogonal gradient constraint on task adapters.
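The EWC half of this mechanism can be sketched concretely. Below is a minimal numpy illustration of class-conditional parameter importance for a logistic detector, assuming a diagonal Fisher estimate and an additive merge of the real and fake importance terms; the paper's exact weighting scheme is not given in the abstract, so both assumptions are stand-ins.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fisher_per_class(w, X, y):
    """Diagonal Fisher information of a logistic detector, estimated
    separately for real (y=0) and fake (y=1) samples: squared per-sample
    gradients of the negative log-likelihood, averaged within each class."""
    fisher = {}
    for cls in (0, 1):
        Xc, yc = X[y == cls], y[y == cls]
        p = sigmoid(Xc @ w)
        grads = (p - yc)[:, None] * Xc  # per-sample NLL gradient
        fisher[cls] = (grads ** 2).mean(axis=0)
    return fisher

def ewc_penalty(w, w_old, fisher, lam_real=1.0, lam_fake=1.0):
    """Quadratic EWC penalty with separate real/fake importance.
    The additive merge of the two Fisher terms is an assumption."""
    imp = lam_real * fisher[0] + lam_fake * fisher[1]
    return float(np.sum(imp * (w - w_old) ** 2))
```

The penalty vanishes when the parameters stay at their previous-task values and grows quadratically along directions the class-conditional Fisher marks as important.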
If this is right
- The model reaches a dynamic balance between stability against forgetting and plasticity for new forgery types.
- Average detection error rates fall substantially relative to prior state-of-the-art methods.
- Detection performance on previously unseen forgery domains rises without requiring storage of past data.
- Task-specific adapters can be updated orthogonally while real-versus-fake parameter importance is preserved.
Where Pith is reading between the lines
- The same fusion of spatial and frequency cues could be tested on non-facial image forgery tasks such as document or video manipulation.
- Separately weighting parameters for real and fake classes might reduce forgetting in other binary continual-learning settings outside detection.
- The orthogonal update rule on adapters suggests a general way to limit interference in replay-free continual learning for other computer-vision problems.
- Extending the framework to video sequences would reveal whether the spatial-frequency synergy scales beyond static images.
Load-bearing premise
That fusing spatial and frequency-domain features will capture the full variety of forgery traces and that elastic weight consolidation paired with orthogonal gradient constraint will prevent forgetting while enabling adaptation without any replay of historical data.
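The frequency half of that premise rests on generator artifacts often being easier to spot in the spectrum than in pixel space. A minimal sketch, assuming log-magnitude FFT features and plain concatenation as a stand-in for the paper's unspecified fusion module:

```python
import numpy as np

def frequency_features(img):
    """Log-magnitude of the centered 2-D FFT spectrum. Up-sampling and
    blending artifacts left by generators often appear as periodic peaks
    here even when they are faint in pixel space."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    return np.log1p(np.abs(spec))

def fuse(spatial_feat, freq_feat):
    """Concatenation fusion of spatial and frequency features. The paper's
    'synergistic' fusion module is not described in the abstract, so plain
    concatenation is used as an assumed placeholder."""
    return np.concatenate([np.ravel(spatial_feat), np.ravel(freq_feat)])
```

In practice the spatial branch would be a learned CNN embedding rather than raw pixels; the point of the sketch is only the two-domain pipeline shape.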
What would settle it
Sequential training on a series of new forgery domains followed by re-testing on the earliest domains to check whether accuracy on those early domains drops sharply despite the proposed mechanisms.
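Such a protocol is usually scored with the standard continual-learning forgetting metric over an accuracy matrix; a minimal sketch (the paper's exact protocol and metric may differ):

```python
import numpy as np

def average_forgetting(acc):
    """acc[i, j]: accuracy on domain j measured after training through
    domain i (lower triangle populated). Forgetting of domain j is its best
    accuracy at any earlier stage minus its accuracy after the final stage,
    averaged over all but the last domain."""
    acc = np.asarray(acc, dtype=float)
    T = acc.shape[0]
    best = acc[:T - 1, :T - 1].max(axis=0)  # best earlier accuracy per domain
    final = acc[T - 1, :T - 1]              # accuracy after the last task
    return float(np.mean(best - final))
```

A sharp drop on the earliest domains shows up directly as a large value here, which is exactly the failure mode the proposed mechanisms are meant to prevent.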
Original abstract
The rapid advancement of facial forgery techniques poses severe threats to public trust and information security, making facial DeepFake detection a critical research priority. Continual learning provides an effective approach to adapt facial DeepFake detection models to evolving forgery patterns. However, existing methods face two key bottlenecks in real-world continual learning scenarios: insufficient feature representation and catastrophic forgetting. To address these issues, we propose Face-D(^2)CL, a framework for facial DeepFake detection. It leverages multi-domain synergistic representation to fuse spatial and frequency-domain features for the comprehensive capture of diverse forgery traces, and employs a dual continual learning mechanism that combines Elastic Weight Consolidation (EWC), which distinguishes parameter importance for real versus fake samples, and Orthogonal Gradient Constraint (OGC), which ensures updates to task-specific adapters do not interfere with previously learned knowledge. This synergy enables the model to achieve a dynamic balance between robust anti-forgetting capabilities and agile adaptability to emerging facial forgery paradigms, all without relying on historical data replay. Extensive experiments demonstrate that our method surpasses current SOTA approaches in both stability and plasticity, achieving 60.7% relative reduction in average detection error rate, respectively. On unseen forgery domains, it further improves the average detection AUC by 7.9% compared to the current SOTA method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Face-D²CL for continual facial DeepFake detection. It introduces multi-domain synergistic representation by fusing spatial and frequency-domain features to capture diverse forgery traces, paired with a dual continual learning mechanism: Elastic Weight Consolidation (EWC) that distinguishes parameter importance for real versus fake samples, and Orthogonal Gradient Constraint (OGC) on task-specific adapters to avoid interference with prior knowledge. The approach claims to enable adaptation to new forgery paradigms without historical data replay, achieving a 60.7% relative reduction in average detection error rate and a 7.9% average AUC improvement on unseen domains over current SOTA methods.
Significance. If the reported gains hold under rigorous validation, the work could meaningfully advance continual learning for security applications by offering a no-replay solution that balances stability and plasticity in the face of evolving forgery techniques. The specific pairing of EWC (real/fake importance) with OGC on adapters is a targeted contribution, but its effectiveness depends on empirical demonstration of low backward transfer across realistic domain sequences.
Major comments (2)
- [Abstract] Abstract: The central empirical claims (60.7% relative error-rate reduction and 7.9% unseen-domain AUC lift) are presented without any reference to the datasets, the number or ordering of continual-learning domains, baseline implementations, ablation studies, or statistical tests. This absence prevents verification of whether the dual CL mechanism actually delivers the claimed synergy.
- [Method] Method (dual CL section): The combination of EWC (with real/fake importance weighting) and OGC (orthogonal updates on adapters) is asserted to prevent catastrophic forgetting without replay, yet no analysis addresses whether EWC's quadratic penalty remains valid when forgery artifacts shift between spatial and frequency domains; the skeptic concern that this may reduce plasticity for novel traces is not tested or bounded.
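The OGC concern above can be made concrete with a generic gradient-projection sketch. The subspace construction here is an assumption, since the abstract does not specify how the important directions of earlier tasks are stored:

```python
import numpy as np

def project_orthogonal(grad, prev_basis):
    """Remove from `grad` its component inside the subspace spanned by the
    columns of `prev_basis` (assumed to hold important directions of earlier
    tasks), so the adapter update is orthogonal to previously learned
    knowledge. A generic sketch in the spirit of OGC, not the paper's
    exact construction."""
    if prev_basis.size == 0:
        return grad
    Q, _ = np.linalg.qr(prev_basis)  # orthonormalize the stored directions
    return grad - Q @ (Q.T @ grad)
```

The plasticity worry is visible in this form: every direction added to `prev_basis` shrinks the subspace available for learning novel traces.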
Minor comments (2)
- [Abstract] The sentence ending 'achieving 60.7% relative reduction in average detection error rate, respectively' contains an extraneous 'respectively' with no antecedent list.
- The notation Face-D(^2)CL should be clarified in the title and introduction; it is unclear whether the superscript denotes 'dual' or another quantity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for highlighting areas where the presentation of our empirical claims and methodological analysis could be strengthened. We address each major comment point by point below, indicating where revisions to the manuscript are planned.
Point-by-point responses
-
Referee: [Abstract] Abstract: The central empirical claims (60.7% relative error-rate reduction and 7.9% unseen-domain AUC lift) are presented without any reference to the datasets, the number or ordering of continual-learning domains, baseline implementations, ablation studies, or statistical tests. This absence prevents verification of whether the dual CL mechanism actually delivers the claimed synergy.
Authors: We agree that the abstract's brevity omits these specifics, which are essential for immediate verification. The full manuscript details the datasets (including FaceForensics++, Celeb-DF, and additional forgery sources for the continual sequences), the exact ordering of domains in the learning protocol, baseline re-implementations, comprehensive ablation studies isolating each component of the dual CL mechanism, and statistical tests supporting the reported gains. To improve accessibility, we will revise the abstract to concisely reference the experimental setup and direct readers to the Experiments and Ablation sections for full verification of the claimed synergy. revision: yes
-
Referee: [Method] Method (dual CL section): The combination of EWC (with real/fake importance weighting) and OGC (orthogonal updates on adapters) is asserted to prevent catastrophic forgetting without replay, yet no analysis addresses whether EWC's quadratic penalty remains valid when forgery artifacts shift between spatial and frequency domains; the skeptic concern that this may reduce plasticity for novel traces is not tested or bounded.
Authors: We appreciate this insightful concern about the interaction between EWC's penalty and cross-domain artifact shifts. Our ablation studies and unseen-domain evaluations empirically demonstrate that the dual mechanism (EWC with real/fake weighting plus OGC) preserves plasticity, as shown by the 7.9% AUC improvement on novel forgeries without replay. However, we acknowledge that the current version lacks an explicit analysis or bound quantifying any potential plasticity reduction under spatial-frequency shifts. We will add a dedicated discussion subsection in the revised Method or Experiments section, supported by additional targeted experiments measuring backward transfer and plasticity metrics across domain sequences, to directly address and bound this aspect. revision: partial
Circularity Check
No circularity: empirical combination of known techniques with experimental validation
Full rationale
The paper proposes Face-D²CL as a practical framework that fuses spatial/frequency features and applies EWC+OGC for continual learning without replay. No equations, derivations, or first-principles predictions appear in the provided text; performance gains (error-rate reduction, AUC improvement) are asserted solely via experiments on unseen domains. EWC and OGC are standard cited methods, not redefined or fitted in a self-referential loop within this work. The central claim therefore rests on empirical results rather than any reduction of outputs to inputs by construction.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1] Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Lluis Castrejon, Kelvin Chan, Yichang Chen, Sander Dieleman, Yuqing Du, et al.
- [2]
- [3] Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K Dokania, Philip HS Torr, and Marc'Aurelio Ranzato. 2019. On tiny episodic memories in continual learning. arXiv preprint arXiv:1902.10486 (2019).
- [4] Jikang Cheng, Zhiyuan Yan, Ying Zhang, Li Hao, Jiaxin Ai, Qin Zou, Chen Li, and Zhongyuan Wang. 2025. Stacking brick by brick: Aligned feature isolation for incremental face forgery detection. In Proceedings of the Computer Vision and Pattern Recognition Conference. 13927–13936.
- [5] François Chollet. 2017. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1251–1258.
- [6]
- [7] Nick Dufour and Andrew Gully. 2019. Contributing data to deepfake detection research. Google AI Blog 1, 2 (2019), 3.
- [8] Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. 2024. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first International Conference on Machine Learning.
- [9] Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, and Maja Pantic. 2021. Lips don't lie: A generalisable and robust approach to face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5039–5049.
- [10] Yue-Hua Han, Tai-Ming Huang, Kai-Lung Hua, and Jun-Cheng Chen. 2025. Towards more general video-based deepfake detection through facial component guided adaptation for foundation model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 22995–23005.
- [11] Fa-Ting Hong and Dan Xu. 2023. Implicit identity representation conditioned memory compensation network for talking head video generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 23062–23072.
- [12] Yongkang Hu, Yu Cheng, Yushuo Zhang, Yuan Xie, and Zhaoxia Yin. 2026. SAIDO: Generalizable Detection of AI-Generated Images via Scene-Aware and Importance-Guided Dynamic Optimization in Continual Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- [13] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2018. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In Proceedings of the International Conference on Learning Representations.
- [14] Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. 2021. Alias-free generative adversarial networks. Advances in Neural Information Processing Systems 34 (2021), 852–863.
- [15] Minha Kim, Shahroz Tariq, and Simon S Woo. 2021. Cored: Generalizing fake media detection with continual representation using distillation. In Proceedings of the 29th ACM International Conference on Multimedia. 337–346.
- [16] James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences 114, 13 (2017), 3521–3526.
- [17] Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, and Baining Guo. 2020. Face x-ray for more general face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5001–5010.
- [18] Yuezun Li, Ming-Ching Chang, and Siwei Lyu. 2018. In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking. In 2018 IEEE International Workshop on Information Forensics and Security. 1–7.
- [19] Zhizhong Li and Derek Hoiem. 2018. Learning without Forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 12 (2018), 2935–2947.
- [20] Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, Hui Xue, Weiming Zhang, and Nenghai Yu. 2021. Spatial-phase shallow learning: rethinking face forgery detection in frequency domain. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 772–781.
- [21] Zheda Mai, Ruiwen Li, Hyunwoo Kim, and Scott Sanner. 2021. Supervised contrastive replay: Revisiting the nearest class mean classifier in online class-incremental continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 3589–3599.
- [22] Kun Pan, Yifang Yin, Yao Wei, Feng Lin, Zhongjie Ba, Zhenguang Liu, Zhibo Wang, Lorenzo Cavallaro, and Kui Ren. 2023. Dfil: Deepfake incremental learning by exploiting domain-invariant forgery clues. In Proceedings of the 31st ACM International Conference on Multimedia. 8035–8046.
- [23] Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, and Jing Shao. 2020. Thinking in frequency: Face forgery detection by mining frequency-aware clues. In European Conference on Computer Vision. 86–103.
- [24] Jingyang Qiao, Xin Tan, Chengwei Chen, Yanyun Qu, Yong Peng, Yuan Xie, et al. 2024. Prompt gradient projection for continual learning. In The Twelfth International Conference on Learning Representations.
- [25] Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H Lampert. 2017. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2001–2010.
- [26] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10684–10695.
- [27] Andreas Rossler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. 2019. FaceForensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1–11.
- [28] Andrei A Rusu, Neil C Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. 2016. Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016).
- [29] Gobinda Saha, Isha Garg, and Kaushik Roy. 2021. Gradient Projection Memory for Continual Learning. In International Conference on Learning Representations.
- [30] Kaede Shiohara and Toshihiko Yamasaki. 2022. Detecting deepfakes with self-blended images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18720–18729.
- [31] Kaede Shiohara, Xingchao Yang, and Takafumi Taketomi. 2023. BlendFace: Redesigning identity encoders for face-swapping. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 7634–7644.
- [32] Ke Sun, Shen Chen, Taiping Yao, Xiaoshuai Sun, Shouhong Ding, and Rongrong Ji. 2025. Continual face forgery detection via historical distribution preserving. International Journal of Computer Vision 133, 3 (2025), 1067–1084.
- [33] Ke Sun, Taiping Yao, Shen Chen, Shouhong Ding, Jilin Li, and Rongrong Ji. 2022. Dual contrastive learning for general face forgery detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 2316–2324.
- [34] Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning. 6105–6114.
- [35] Jiahe Tian, Cai Yu, Xi Wang, Peng Chen, Zihao Xiao, Jizhong Han, and Yesheng Chai. 2024. Dynamic mixed-prototype model for incremental deepfake detection. In Proceedings of the 32nd ACM International Conference on Multimedia. 8129–8138.
- [36] Ying Xu, Kiran Raja, and Marius Pedersen. 2022. Supervised contrastive learning for generalizable and explainable deepfakes detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 379–389.
- [37] Jiazhen Yan, Ziqiang Li, Fan Wang, Ziwen He, and Zhangjie Fu. 2026. Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection. IEEE Transactions on Information Forensics and Security (2026).
- [38] Shipeng Yan, Jiangwei Xie, and Xuming He. 2021. DER: Dynamically expandable representation for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 3014–3023.
- [39] Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, et al. 2024. DF40: Toward next-generation deepfake detection. Advances in Neural Information Processing Systems 37 (2024), 29387–29434.
- [40] Xin Yang, Yuezun Li, and Siwei Lyu. 2019. Exposing deep fakes using inconsistent head poses. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. 8261–8265.
- [41]
- [42] Nan Zhong, Mian Zou, Yiran Xu, Zhenxing Qian, Xinpeng Zhang, Baoyuan Wu, and Kede Ma. 2026. Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective. IEEE Transactions on Pattern Analysis and Machine Intelligence (2026), 1–16.
- [43] Dewei Zhou, You Li, Fan Ma, Xiaoting Zhang, and Yi Yang. 2024. MIGC: Multi-instance generation controller for text-to-image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6818–6828.
- [44] Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, and Yu-Gang Jiang. 2020. WildDeepfake: A challenging real-world dataset for deepfake detection. In Proceedings of the 28th ACM International Conference on Multimedia. 2382–2390.