Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection
Pith reviewed 2026-05-15 02:33 UTC · model grok-4.3
The pith
A GAN-based upsampling method plus Separate Expert Fusion reduces artifact bias and improves generalization in AI-generated image detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that introducing aligned yet distinct artifact patterns through GAN-based upsampling, then extracting and fusing them via domain-specific LoRA experts and a gating network, lets the detector learn forgery cues that are less biased toward any single generation family, so that it generalizes better across a broader range of AI image generators.
What carries the argument
The Separate Expert Fusion (SEF) framework: domain-specific experts trained by LoRA adaptation on a frozen foundation model, followed by decoupled fusion through a gating network that combines specialized features without cross-domain interference.
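The decoupled fusion step can be pictured as a gating network that softmax-weights the features of two frozen experts. The sketch below is a minimal illustration under assumed shapes and names (`feat_rec`, `feat_ups`, an 8-dimensional feature, a linear gate); it is not the paper's implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_experts(feat_rec, feat_ups, gate_w, gate_b):
    """Decoupled fusion: a small gating network assigns one weight per
    expert and combines their (frozen) features as a convex mixture."""
    concat = np.concatenate([feat_rec, feat_ups])   # gate input
    weights = softmax(gate_w @ concat + gate_b)     # one weight per expert
    fused = weights[0] * feat_rec + weights[1] * feat_ups
    return fused, weights

rng = np.random.default_rng(0)
d = 8
feat_rec = rng.normal(size=d)            # reconstruction-expert feature
feat_ups = rng.normal(size=d)            # upsampling-expert feature
gate_w = rng.normal(size=(2, 2 * d)) * 0.1
gate_b = np.zeros(2)

fused, w = fuse_experts(feat_rec, feat_ups, gate_w, gate_b)
```

Because the experts stay frozen and only the gate mixes their outputs, neither domain's gradients can disturb the other's specialized representation, which is the interference-reduction argument in a nutshell.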
If this is right
- Detection accuracy rises on GAN-generated images because the added artifact patterns fill a coverage gap.
- Performance improves across thirteen benchmarks spanning multiple generative families.
- Domain interference is avoided during training, preserving each expert's specialized knowledge.
- The learned decision boundary becomes more robust to variations in generative methods.
Where Pith is reading between the lines
- The same expert-fusion pattern could be applied to other media where reconstruction and synthesis artifacts differ in character.
- New generator families could be incorporated by designing matching upsampling procedures that keep alignment.
- Real-world deployment would benefit from checking whether the fused experts remain effective when test images mix multiple unknown generators.
Load-bearing premise
The GAN-based upsampling produces artifact patterns that remain aligned with reconstruction fakes in content, size, and format while being distinct enough to supply useful complementary information.
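As a minimal, non-learned stand-in for the paper's GAN-based upsampling, the sketch below downsamples an image and re-upsamples it by zero insertion followed by convolution, the operation classically associated with checkerboard artifacts in GAN generators. The helper names and the bilinear kernel are illustrative assumptions; the point is only that the output keeps the input's content, size, and format while carrying a distinct artifact signature.

```python
import numpy as np

def avg_pool2(img):
    """2x2 average pooling (content-preserving downsample)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def zero_insert_upsample2(img):
    """Insert zeros between pixels: the step that typically seeds
    checkerboard/spectral artifacts in upsampling generators."""
    h, w = img.shape
    out = np.zeros((2 * h, 2 * w))
    out[::2, ::2] = img
    return out

def fake_via_upsampling(img, kernel):
    """Down- then up-sample so the result is aligned with the input
    (same content, size, format) but carries upsampling artifacts."""
    low = avg_pool2(img)
    up = zero_insert_upsample2(low)
    kh, kw = kernel.shape
    pad = np.pad(up, ((kh // 2,), (kw // 2,)))
    out = np.zeros_like(up)
    for i in range(up.shape[0]):          # naive 'same' convolution
        for j in range(up.shape[1]):
            out[i, j] = (pad[i:i + kh, j:j + kw] * kernel).sum()
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8))
bilinear = np.array([[0.25, 0.5, 0.25],
                     [0.5,  1.0, 0.5 ],
                     [0.25, 0.5, 0.25]])
fake = fake_via_upsampling(img, bilinear)
```

The premise then says these artifacts must be distinct from reconstruction artifacts yet not confounded with content, size, or format differences, which this pipeline preserves by construction.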
What would settle it
Train the model on the proposed paired fakes, then test on a held-out generator family; if detection accuracy shows no gain over a reconstruction-only baseline, the claimed generalization benefit is falsified.
Original abstract
As the misuse of AI-generated images grows, generalizable image detection techniques are urgently needed. Recent state-of-the-art (SOTA) methods adopt aligned training datasets to reduce content, size, and format biases, empowering models to capture robust forgery cues. A common strategy is to employ reconstruction techniques, e.g., VAE and DDIM, which show remarkable results in diffusion-based methods. However, such reconstruction-based approaches typically introduce limited and homogeneous artifacts, which cannot fully capture diverse generative patterns, such as GAN-based methods. To complement reconstruction-based fake images with aligned yet diverse artifact patterns, we propose a GAN-based upsampling approach that mimics GAN-generated fake patterns while preserving content, size, and format alignment. This naturally results in two aligned but distinct types of fake images. However, due to the domain shift between reconstruction-based and upsampling-based fake images, direct mixed training causes suboptimal results, where one domain disrupts feature learning of the other. Accordingly, we propose a Separate Expert Fusion (SEF) framework to extract complementary artifact information and reduce inter-domain interference. We first train domain-specific experts via LoRA adaptation on a frozen foundational model, then conduct decoupled fusion with a gating network to adaptively combine expert features while retaining their specialized knowledge. Rather than merely benefiting GAN-generated image detection, this design introduces diverse and complementary artifact patterns that enable SEF to learn a more robust decision boundary and improve generalization across broader generative methods. Extensive experiments demonstrate that our method yields strong results across 13 diverse benchmarks. Codes are released at: https://github.com/liyih/SEF_AIGC_detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that reconstruction-based fakes (VAE/DDIM) produce limited homogeneous artifacts, so a GAN-based upsampling method is introduced to generate aligned yet diverse fake images; because direct mixing causes domain-shift interference, a Separate Expert Fusion (SEF) framework trains LoRA experts on a frozen backbone and uses a gating network for decoupled fusion, yielding a more robust decision boundary and improved generalization across 13 benchmarks.
Significance. If the complementarity of the two artifact families is demonstrated and the reported gains hold under controlled ablations, the work would provide a practical route to reduce content/size/format bias while expanding coverage to both diffusion and GAN generators; the code release is a clear positive for reproducibility.
Major comments (3)
- Abstract and §4 (Experiments): the central claim of strong generalization across 13 benchmarks is asserted without any reported baseline numbers, ablation tables, statistical significance tests, or precise train/test splits, so the performance gains cannot be verified or attributed to the proposed components.
- §3.2 (GAN-based upsampling) and §3.3 (SEF): the key assumption that upsampling artifacts are both content-aligned and sufficiently orthogonal to reconstruction artifacts is unsupported by any quantitative measure (distribution distances, per-expert activation statistics, or orthogonality metrics), leaving the rationale for SEF and the claimed complementarity unverified.
- §4: the statement that direct mixed training is suboptimal is presented without a side-by-side quantitative comparison (e.g., accuracy or AUC tables for mixed training versus SEF), so the necessity of the gating network and the interference-reduction benefit remain unshown.
Minor comments (1)
- [§3.3] Notation for the gating network and LoRA rank could be introduced earlier and used consistently in equations.
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed feedback. We agree that additional experimental details, quantitative validations, and direct comparisons will strengthen the manuscript. We will revise accordingly and address each major comment below.
Point-by-point responses
- Referee: Abstract and §4 (Experiments): the central claim of strong generalization across 13 benchmarks is asserted without any reported baseline numbers, ablation tables, statistical significance tests, or precise train/test splits, so the performance gains cannot be verified or attributed to the proposed components.
Authors: We acknowledge that the current presentation in the abstract and §4 lacks sufficient detail for full verification. In the revised version, we will add explicit baseline comparisons against SOTA methods, complete ablation tables, statistical significance tests (e.g., paired t-tests with p-values across 5 runs), and precise descriptions of the train/test splits for all 13 benchmarks. This will make the reported gains verifiable and attributable to the proposed GAN upsampling and SEF components. Revision: yes.
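The promised paired t-test across seeds could look like the following sketch. The per-run accuracies are hypothetical placeholders, not numbers from the paper; only the statistical procedure (`scipy.stats.ttest_rel` on matched runs) is the point.

```python
from scipy.stats import ttest_rel

# Hypothetical per-seed accuracies over 5 matched runs (illustrative only)
baseline = [0.80, 0.81, 0.82, 0.80, 0.81]   # reconstruction-only training
sef      = [0.90, 0.90, 0.93, 0.90, 0.91]   # proposed SEF training

# Paired test: each seed yields one (baseline, sef) pair
stat, p = ttest_rel(sef, baseline)
```

A small p-value with a positive statistic would support the claim that SEF's gain is consistent across seeds rather than a single lucky run.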
- Referee: §3.2 (GAN-based upsampling) and §3.3 (SEF): the key assumption that upsampling artifacts are both content-aligned and sufficiently orthogonal to reconstruction artifacts is unsupported by any quantitative measure (distribution distances, per-expert activation statistics, or orthogonality metrics), leaving the rationale for SEF and the claimed complementarity unverified.
Authors: We agree that quantitative evidence for alignment and orthogonality would better support the design rationale. In the revision, we will add FID and LPIPS scores to quantify content/size/format alignment of the GAN-upsampled images, along with per-expert activation statistics and cosine similarity metrics between reconstruction and upsampling expert features to demonstrate their complementarity and reduced interference. Revision: yes.
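One concrete form the proposed orthogonality metric could take is the mean absolute cosine similarity between the two experts' per-image features; values near zero would support complementarity. The feature matrices below are random stand-ins with assumed shapes, not features from the actual model.

```python
import numpy as np

def mean_cross_cosine(A, B):
    """Mean |cosine similarity| between row-features of two experts.
    Near 0 suggests near-orthogonal (complementary) representations."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return float(np.mean(np.abs(A @ B.T)))

rng = np.random.default_rng(1)
rec = rng.normal(size=(32, 64))   # reconstruction-expert features (assumed dim)
ups = rng.normal(size=(32, 64))   # upsampling-expert features (assumed dim)
score = mean_cross_cosine(rec, ups)
```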
- Referee: §4: the statement that direct mixed training is suboptimal is presented without a side-by-side quantitative comparison (e.g., accuracy or AUC tables for mixed training versus SEF), so the necessity of the gating network and the interference-reduction benefit remain unshown.
Authors: We will include a new side-by-side comparison table in §4 reporting accuracy and AUC for direct mixed training versus the full SEF framework across the 13 benchmarks. This will quantitatively illustrate the performance drop due to domain interference in mixed training and the benefit provided by the gating network. Revision: yes.
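For the promised AUC comparison, the Mann-Whitney formulation is enough: AUC is the probability that a randomly chosen fake scores above a randomly chosen real image. The detector scores below are invented for illustration only.

```python
def auc(pos_scores, neg_scores):
    """Mann-Whitney AUC: fraction of (fake, real) pairs ranked correctly,
    counting ties as half a win."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical detector scores on fakes (pos) and reals (neg)
mixed_auc = auc([0.7, 0.6, 0.8], [0.5, 0.65, 0.4])   # mixed training
sef_auc   = auc([0.9, 0.85, 0.8], [0.3, 0.4, 0.2])   # SEF
```

Reporting this per benchmark for both training regimes would make the interference claim directly checkable.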
Circularity Check
No circularity in empirical framework
Rationale
The paper proposes an empirical training framework (GAN-based upsampling for aligned artifacts + SEF with LoRA domain experts and gating fusion) and validates it via experiments on 13 benchmarks. No equations, derivations, or fitted parameters are presented that reduce the reported generalization gains to quantities defined by construction from the inputs. Claims of complementary artifact patterns rest on experimental outcomes rather than self-referential definitions or self-citation chains. The method is self-contained against external benchmarks with no load-bearing reductions to prior author work.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: reconstruction techniques introduce limited and homogeneous artifacts that cannot fully capture diverse generative patterns such as those of GAN-based methods.
- Domain assumption: direct mixed training of reconstruction-based and upsampling-based fakes causes suboptimal results due to domain shift.