Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
Pith reviewed 2026-05-17 23:20 UTC · model grok-4.3
The pith
Decomposing features via SVD into orthogonal parts lets detectors freeze general pre-trained knowledge and adapt only the rest to spot AI fakes without overfitting.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using singular value decomposition (SVD) to split the original feature space into two orthogonal subspaces, then freezing the principal components and adapting only the remaining ones, preserves pre-trained knowledge while the model learns fake patterns. The authors claim this keeps the rank of the whole feature space higher, reduces overfitting, and improves generalization relative to full-parameter and LoRA-based tuning. They also claim the method implicitly learns a prior that fakes are derived from the real, which indicates a hierarchical relationship rather than independence.
What carries the argument
SVD orthogonal subspace decomposition that freezes principal components to retain pre-trained rank and adapts only residual components to learn detection signals.
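The split described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the page does not say which matrix is decomposed, so `W` here is a random stand-in, and the function name `split_by_svd` and the cutoff `r` are hypothetical. The sketch only shows that a top-`r` SVD truncation and its remainder sum to the original matrix and are mutually orthogonal.

```python
import numpy as np

def split_by_svd(matrix, r):
    """Split `matrix` into a rank-r principal part and its orthogonal residual."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Principal part: the r largest singular triplets (the part that would be frozen).
    principal = (u[:, :r] * s[:r]) @ vt[:r, :]
    # Residual part: the remaining triplets (the part that would be adapted).
    residual = (u[:, r:] * s[r:]) @ vt[r:, :]
    return principal, residual

# Stand-in matrix; an assumption for illustration, not data from the paper.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
principal, residual = split_by_svd(W, r=4)

# The two parts reconstruct W, and their column spaces are orthogonal.
reconstructs = np.allclose(principal + residual, W)
orthogonal = np.allclose(principal.T @ residual, 0.0, atol=1e-10)
```

Because the left singular vectors are orthonormal, the cross product `principal.T @ residual` vanishes by construction, which is the sense in which the two subspaces are orthogonal.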
Load-bearing premise
The largest directions found by SVD on pre-trained vision features hold general visual knowledge that does not overlap with the specific clues needed to detect fakes, so freezing them keeps useful information without blocking detection learning.
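One way to probe this premise is to ask how much of a simple real-versus-fake direction lies inside the top singular directions. The sketch below is a hypothetical diagnostic, not something the paper reports: the function name `principal_energy_fraction`, the class-mean difference as a proxy for the detection signal, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def principal_energy_fraction(features, labels, r):
    """Fraction of the class-mean difference that lies in the top-r principal subspace."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Proxy for the detection signal: mean(fake) - mean(real).
    direction = features[labels == 1].mean(axis=0) - features[labels == 0].mean(axis=0)
    coords = vt[:r] @ direction  # coordinates along the top-r right singular vectors
    return float(coords @ coords / (direction @ direction))

# Synthetic worst case (assumed): fakes are shifted along the highest-variance axis.
rng = np.random.default_rng(0)
scale = np.array([10.0, 1.0, 1.0, 1.0, 1.0])
real = rng.standard_normal((200, 5)) * scale
fake = rng.standard_normal((200, 5)) * scale
fake[:, 0] += 20.0
features = np.vstack([real, fake])
labels = np.array([0] * 200 + [1] * 200)
fraction = principal_energy_fraction(features, labels, r=1)
```

A fraction close to 1 would mean the proxy signal sits almost entirely in the subspace that gets frozen, which is the failure case the premise rules out; a fraction close to 0 would be consistent with the premise.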
What would settle it
Apply the method to a training set of images from several known generators, then evaluate accuracy on a held-out generator never seen in training; if performance matches or falls below a standard fine-tuned baseline, the benefit of freezing principal components is not supported.
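The held-out-generator protocol above can be sketched as a leave-one-generator-out split. This is an illustrative outline only: the generator names and the helper `leave_one_generator_out` are hypothetical, and real images would need their own train/test split, which is omitted here.

```python
from collections import defaultdict

def leave_one_generator_out(samples):
    """Yield (held_out, train, test) splits from a list of (item, generator) pairs."""
    by_gen = defaultdict(list)
    for item, gen in samples:
        by_gen[gen].append(item)
    for held_out in sorted(by_gen):
        # Train on every generator except the held-out one.
        train = [x for g in sorted(by_gen) if g != held_out for x in by_gen[g]]
        # Test only on the generator never seen during training.
        test = list(by_gen[held_out])
        yield held_out, train, test

# Placeholder fake images tagged with placeholder generator names.
samples = [("img0", "gen_a"), ("img1", "gen_a"), ("img2", "gen_b"), ("img3", "gen_c")]
splits = list(leave_one_generator_out(samples))
```

Running the proposed method and a standard fine-tuned baseline on each split, then comparing accuracy on `test`, is the comparison the paragraph above calls for.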
Original abstract
AI-generated images (AIGIs), such as natural or face images, have become increasingly important yet challenging. In this paper, we start from a new perspective to excavate the reason behind the failure generalization in AIGI detection, named the asymmetry phenomenon, where a naively trained detector tends to favor overfitting to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked, which is proved seriously limiting the expressivity and generalization. One potential remedy is incorporating the pre-trained knowledge within the vision foundation models (higher-ranked) to expand the feature space, alleviating the model's overfitting to fake. To this end, we employ Singular Value Decomposition (SVD) to decompose the original feature space into two orthogonal subspaces. By freezing the principal components and adapting only the remained components, we preserve the pre-trained knowledge while learning fake patterns. Compared to existing full-parameters and LoRA-based tuning methods, we explicitly ensure orthogonality, enabling the higher rank of the whole feature space, effectively minimizing overfitting and enhancing generalization. We finally identify a crucial insight: our method implicitly learns a vital prior that fakes are actually derived from the real, indicating a hierarchical relationship rather than independence. Modeling this prior, we believe, is essential for achieving superior generalization. Our codes are publicly available at https://github.com/YZY-stack/Effort-AIGI-Detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that naive training for AI-generated image (AIGI) detection exhibits an 'asymmetry phenomenon' in which the detector overfits to limited and monotonous fake patterns, collapsing the feature space to low rank and harming generalization. To remedy this, the authors apply Singular Value Decomposition (SVD) to features extracted from pre-trained vision foundation models, decomposing the space into two orthogonal subspaces. They freeze the principal components (to retain general pre-trained knowledge) while adapting only the residual components (to capture fake patterns), explicitly enforcing orthogonality to maintain higher rank, reduce overfitting, and improve generalization relative to full-parameter or LoRA tuning. The work also identifies an implicit prior that fakes are hierarchically derived from reals rather than independent.
Significance. If the SVD partitioning reliably isolates general knowledge from task-specific adaptation without discarding detection-critical directions, the method would supply a principled, orthogonality-aware fine-tuning recipe that directly targets rank collapse, a recurring issue in AIGI generalization. The derived insight about modeling the real-to-fake hierarchical prior could usefully shape subsequent detector design.
major comments (1)
- [Method (SVD decomposition and asymmetry phenomenon)] The load-bearing assumption that the principal components obtained from SVD on pre-trained features encode only general real-image knowledge and lie orthogonal to (and independent of) the directions needed to detect fakes is not justified in the method description. SVD is performed on the variance structure of the pre-trained feature matrix without any real/fake separation; because common AIGI artifacts (frequency biases, diffusion-specific patterns) frequently align with high-variance axes, a non-negligible fraction of the detection signal may reside in the frozen principal subspace. Freezing it would then remove rather than protect useful information, undermining both the orthogonality guarantee and the claimed generalization benefit. This concern directly affects the central claim and requires either a formal argument or targeted ablations showing the contribution of the frozen versus
minor comments (2)
- Specify exactly on which feature matrix (real-only, mixed real/fake, or pre-training corpus) the SVD is computed and how the rank cutoff for the principal subspace is chosen.
- The abstract asserts generalization gains and the implicit prior but supplies no quantitative metrics, ablation tables, or cross-generator results; these must be clearly presented and compared against full fine-tuning and LoRA baselines.
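On the first minor comment, one common convention for picking a rank cutoff is a cumulative-energy threshold over the singular values. The sketch below shows that convention only; nothing on this page confirms that the paper uses it, and the function name `rank_for_energy` and the 0.9 default are assumptions.

```python
import numpy as np

def rank_for_energy(singular_values, threshold=0.9):
    """Smallest r whose top-r singular values hold at least `threshold` of the squared energy."""
    s = np.asarray(singular_values, dtype=float)
    energy = np.cumsum(np.square(s))
    energy /= energy[-1]  # normalize so the last entry is exactly 1.0
    # First index where cumulative energy reaches the threshold, converted to a count.
    return int(np.searchsorted(energy, threshold) + 1)

# Singular values 3, 2, 1 carry 9/14, 4/14 and 1/14 of the squared energy.
r_90 = rank_for_energy([3.0, 2.0, 1.0], threshold=0.9)
r_50 = rank_for_energy([3.0, 2.0, 1.0], threshold=0.5)
```

Reporting the threshold (or the fixed `r`) alongside the matrix the SVD was computed on would answer the referee's question directly.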
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. The concern regarding the justification of the SVD decomposition assumption is well-taken, and we address it directly below while outlining planned revisions to strengthen the presentation.
Point-by-point responses
-
Referee: [Method (SVD decomposition and asymmetry phenomenon)] The load-bearing assumption that the principal components obtained from SVD on pre-trained features encode only general real-image knowledge and lie orthogonal to (and independent of) the directions needed to detect fakes is not justified in the method description. SVD is performed on the variance structure of the pre-trained feature matrix without any real/fake separation; because common AIGI artifacts (frequency biases, diffusion-specific patterns) frequently align with high-variance axes, a non-negligible fraction of the detection signal may reside in the frozen principal subspace. Freezing it would then remove rather than protect useful information, undermining both the orthogonality guarantee and the claimed generalization benefit. This concern directly affects the central claim and requires either a formal argument or targeted
Authors: We agree that the unsupervised nature of SVD on the pre-trained feature matrix does not explicitly separate real and fake directions, and that certain AIGI artifacts could in principle align with high-variance axes. Our core rationale is that the principal components still predominantly encode the high-rank, general visual priors learned from massive real-image corpora during foundation-model pre-training; the residual subspace then captures the lower-variance deviations that correspond to the hierarchical real-to-fake relationship we identify. The explicit orthogonality constraint we impose further prevents rank collapse even if some detection signal overlaps the principal directions. To directly address the referee’s request, we will add targeted ablations in the revised manuscript that (i) measure detection performance when the principal subspace is progressively unfrozen and (ii) quantify the rank and generalization gap with and without the orthogonality constraint. These experiments will clarify the contribution of each subspace and strengthen the empirical support for our modeling choice.
Revision: partial
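The rank quantification promised in ablation (ii) needs a concrete rank measure, which this page does not specify. A minimal sketch of one standard option, the entropy-based effective rank of the singular value spectrum, is given below; the function name `effective_rank` is hypothetical and the choice of measure is an assumption.

```python
import numpy as np

def effective_rank(features):
    """Effective rank: exponential of the entropy of the normalized singular values."""
    s = np.linalg.svd(features, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log(p) is defined
    return float(np.exp(-(p * np.log(p)).sum()))

# A matrix with a flat spectrum has effective rank equal to its dimension.
flat = effective_rank(np.eye(4))
# A rank-one matrix has effective rank close to 1 (a fully collapsed spectrum).
collapsed = effective_rank(np.outer(np.arange(1, 6), np.arange(1, 4)))
```

Comparing this value on features from the frozen-principal model, a fully fine-tuned model, and a LoRA-tuned model would make the rank-collapse claim measurable.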
Circularity Check
No circularity: SVD decomposition is an explicit algorithmic choice, not a self-referential reduction
Full rationale
The paper's central derivation applies SVD to the feature matrix of a pre-trained vision model, freezes the top singular components, and adapts only the orthogonal residual subspace. This procedure is defined directly by the linear algebra of SVD and the training protocol; the resulting feature space rank and orthogonality follow from the decomposition itself rather than from any fitted parameter that is later renamed as a prediction. The asymmetry phenomenon is presented as an empirical observation motivating the method, and the claim that freezing principal components preserves general knowledge is an interpretive hypothesis evaluated on downstream generalization benchmarks, not a tautology. No self-citation chain, uniqueness theorem, or ansatz smuggling appears in the load-bearing steps. The method is therefore self-contained against external benchmarks.
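The statement that orthogonality follows from the decomposition itself can be checked numerically. The sketch below projects an arbitrary update onto the complement of the top-`r` singular subspaces, so the frozen components are untouched by it. This is one illustrative way to confine adaptation to the residual subspace, not a description of the paper's training procedure; `project_update_to_residual` and the random matrices are assumptions.

```python
import numpy as np

def project_update_to_residual(weight, update, r):
    """Remove from `update` every component that touches the top-r singular subspaces of `weight`."""
    u, _, vt = np.linalg.svd(weight, full_matrices=False)
    u_r, v_r = u[:, :r], vt[:r, :].T
    # Projectors onto the orthogonal complements of the frozen left and right subspaces.
    left = np.eye(weight.shape[0]) - u_r @ u_r.T
    right = np.eye(weight.shape[1]) - v_r @ v_r.T
    return left @ update @ right

# Stand-in weight and raw update; both are random placeholders.
rng = np.random.default_rng(1)
W = rng.standard_normal((6, 5))
delta = project_update_to_residual(W, rng.standard_normal((6, 5)), r=3)

# The projected update has no component along the frozen singular vectors.
u, s, vt = np.linalg.svd(W, full_matrices=False)
left_clean = np.allclose(u[:, :3].T @ delta, 0.0, atol=1e-10)
right_clean = np.allclose(delta @ vt[:3].T, 0.0, atol=1e-10)
```

Since `delta` is annihilated by the frozen singular vectors on both sides, adding it to `W` leaves the top three singular triplets of `W` intact as singular triplets of the updated matrix.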
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Feature spaces extracted from pre-trained vision foundation models admit an SVD decomposition in which the principal components encode general knowledge orthogonal to task-specific fake patterns.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction
Tag: unclear (relation between the paper passage and the cited Recognition theorem)
we employ Singular Value Decomposition (SVD) to decompose the original feature space into two orthogonal subspaces. By freezing the principal components and adapting only the remained components
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, embed_injective
Tag: unclear (relation between the paper passage and the cited Recognition theorem)
our method implicitly learns a vital prior that fakes are actually derived from the real, indicating a hierarchical relationship rather than independence
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 16 Pith papers
-
LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection
LEGO uses multiple generator-specific LoRA modules modulated by an MLP and fused with attention to detect synthetic images, achieving better performance than prior methods while using under 10% of the training data.
-
Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection
SEF introduces GAN upsampling for diverse artifacts and expert fusion to reduce domain interference, yielding stronger generalization on 13 benchmarks for AI-generated image detection.
-
Decoupling Semantics and Fingerprints: A Universal Representation for AI-Generated Image Detection
ODP-Net structurally disentangles universal forgery traces from generator fingerprints and semantics via orthogonal decomposition and purification, delivering state-of-the-art generalization to unseen AI image generat...
-
Rethinking Cross-Domain Evaluation for Face Forgery Detection with Semantic Fine-grained Alignment and Mixture-of-Experts
Cross-AUC exposes large robustness drops in existing face forgery detectors across datasets, while the SFAM model with semantic alignment and region-specific experts delivers better performance on public benchmarks.
-
Combating Pattern and Content Bias: Adversarial Feature Learning for Generalized AI-Generated Image Detection
MAFL uses adversarial training to suppress pattern and content biases, guiding models to learn shared generative features for better cross-model generalization in detecting AI images.
-
Simplicity Prevails: The Emergence of Generalizable AIGI Detection in Visual Foundation Models
Frozen features from vision foundation models enable a linear probe to outperform specialized AIGI detectors by over 30% on in-the-wild data due to emergent forgery knowledge from pre-training.
-
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes
GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.
-
How Noise Benefits AI-generated Image Detection
PiN-CLIP jointly trains a noise generator and detector under a variational positive-incentive principle to inject feature-space noise that suppresses shortcut directions and improves out-of-distribution accuracy by 5....
-
Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
MDMF detects AI-generated images by learning patch-level forensic signatures and quantifying their distributional discrepancies with MMD, yielding larger separation than global methods when micro-defects are present.
-
VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection
VRAG-DFD uses RAG to retrieve forgery knowledge and RL-based training to build critical reasoning in MLLMs, delivering state-of-the-art generalization on deepfake detection tasks.
-
LOGER: Local--Global Ensemble for Robust Deepfake Detection in the Wild
LOGER ensembles heterogeneous global vision models with selective local patch aggregation via multiple instance learning to achieve robust deepfake detection across varied manipulations and degradations.
-
Robust Deepfake Detection: Mitigating Spatial Attention Drift via Calibrated Complementary Ensembles
A multi-stream ensemble using DINOv2 and CLIP backbones trained with extreme degradations achieves stable deepfake detection and fourth place in the NTIRE 2026 challenge.
-
Towards Generalizable Deepfake Image Detection with Vision Transformers
Ensemble of vision transformers reaches 96.77% AUC and 9% EER on DF-Wild deepfake test set, outperforming the prior Effort baseline by 7% AUC and 8% EER.
-
Adaptive Forensic Feature Refinement via Intrinsic Importance Perception
I2P adaptively selects the most discriminative layers from visual foundation models for synthetic image detection and constrains task updates to low-sensitivity parameter subspaces to improve specificity without harmi...
-
Boosting Robust AIGI Detection with LoRA-based Pairwise Training
LoRA-based pairwise training with distortion and size simulations boosts robust AIGI detection under severe distortions, placing third in the NTIRE challenge.
-
HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild
HEDGE is a heterogeneous ensemble using progressive DINOv3 training, multi-scale features, and MetaCLIP2 diversity with dual-gating fusion to achieve robust AI-generated image detection and 4th place in the NTIRE 2026...