pith. machine review for the scientific record.

arXiv:2604.11487 · v1 · submitted 2026-04-13 · 💻 cs.CV

Recognition: unknown

NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

Aashish Negi, Akshay Dudhane, Aleksandr Gushchin, Amit Shukla, Anas M. Ali, Anastasia Antsiferova, Artem Filippov, Bilel Benjdira, Changsheng Chen, Chenfan Qu, Chuanbiao Song, Cong Luo, Dagong Lu, Dmitry Vatolin, Ekaterina Shumitskaya, Fei Wu, Fengjun Guo, Fengpeng Li, Feng Xu, Georgii Bychkov, Haiwei Wu, Haodong Ren, Hao Sun, Hao Tan, Hardik Sharma, Hong Vin Koay, Jayant Kumar, Jiamin Zhang, Jianwei Fei, Jia Wen Seow, Junchi Li, Jun Lan, Kemou Li, Khaled Abud, Mikhail Erofeev, Mufeng Yao, Praful Hambarde, Prateek Shaily, Qi Zhang, Radu Timofte, Ruiyang Xia, Sachin Chaudhary, Sergey Lavrushkin, Shuai Chen, Shunquan Tan, Wadii Boulila, Xinlei Xu, Yaowen Xu, Yongwei Tang, Zhaofan Zou, Zhilin Tu, Zhiqiang Wu, Zhiqiang Yang, Zijian Yu

Pith reviewed 2026-05-10 16:38 UTC · model grok-4.3

classification 💻 cs.CV
keywords AI-generated image detection · robustness to transformations · benchmark dataset · NTIRE challenge · computer vision · image forensics

The pith

The NTIRE 2026 challenge supplies a dataset of 294,500 images from 42 generators plus 36 transformations to test whether AI-image detectors stay reliable after everyday edits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper sets up and reports on a public competition whose goal is to create detectors that distinguish real photographs from AI-generated images even after the images have been cropped, resized, compressed, blurred, or otherwise altered for normal use. The organizers built a new dataset containing 108,750 real images and 185,750 generated images produced by 42 different models that range from open-source to closed-source systems. Test images are additionally processed with 36 standard transformations so that performance can be measured on both pristine and edited versions. Scoring uses ROC AUC across the entire test collection, and the report summarizes the 20 valid submissions received from 511 registered participants.
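
The single-number scoring rule can be sketched as follows. The helper below is a plain Mann-Whitney implementation of ROC AUC, not the organizers' evaluation code, and the labels and scores are invented for illustration.

```python
# Sketch of the challenge metric: one ROC AUC over the pooled test set,
# with pristine and transformed images scored together.

def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen generated image (label 1) outscores a randomly chosen
    real image (label 0), counting ties as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical detector outputs: higher score = "more likely AI-generated".
labels = [1, 1, 1, 0, 0, 0]            # 1 = generated, 0 = real
scores = [0.9, 0.8, 0.2, 0.35, 0.3, 0.1]
print(roc_auc(labels, scores))         # 7/9 ≈ 0.778
```

Because the metric is rank-based, it rewards detectors whose score ordering survives the transformations rather than any particular calibration of the scores.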

Core claim

The challenge rests on the claim that a single large collection spanning many generator architectures and a broad set of realistic post-processing steps supplies a practical testbed for measuring and improving the robustness of AI-generated-image detectors under conditions that occur in everyday distribution.

What carries the argument

The novel dataset of 108,750 real plus 185,750 AI-generated images drawn from 42 generators and then augmented with 36 image transformations.

Load-bearing premise

The 36 selected transformations together with the 42 chosen generators capture the range of edits and model behaviors that actually appear when images circulate online.

What would settle it

A detector that scores high on the challenge test set but drops sharply when evaluated on a fresh collection of generators or transformations outside the 42-plus-36 set.
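
Such a failure would show up as a recall drop on an unseen generator at a threshold that worked on the challenge set. The probe below is a minimal sketch of that check; all scores are invented, and the challenge itself scores ROC AUC rather than thresholded recall.

```python
# Hypothetical probe for the failure mode described above: fix the detector's
# operating threshold on generators inside the 42-model set, then measure how
# often generated images from an unseen generator still get flagged.

def recall_at(scores, threshold):
    """Fraction of generated-image scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

seen_gen_scores = [0.97, 0.91, 0.88, 0.82]    # generators in the challenge set
unseen_gen_scores = [0.64, 0.41, 0.38, 0.22]  # a hypothetical new generator

threshold = 0.80
print(recall_at(seen_gen_scores, threshold))    # 1.0
print(recall_at(unseen_gen_scores, threshold))  # 0.0
```

A large gap between the two numbers is exactly the evidence that would undermine the 42-plus-36 coverage claim.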

Figures

Figures reproduced from arXiv:2604.11487 (authors listed above).

Figure 1. MICV method scheme (DINOv3-based detection framework).
Figure 2. Ant International method scheme.
Figure 3. TeleAI-TeleGuard method scheme (LPT pipeline: distortion simulation, foundation-model fine-tuning, and pairwise optimization).
Figure 4. INTSIG method scheme.
Figure 5. Vincentlc method scheme.
Figure 6. Reagvis Labs method scheme (overview of RAPID).
Figure 7. UESTC method scheme.
Figure 8. PSU method scheme.
Figure 9. Shallow Real method scheme (overall pipeline).
original abstract

This paper presents an overview of the NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild, held in conjunction with the NTIRE workshop at CVPR 2026. The goal of this challenge was to develop detection models capable of distinguishing real images from generated ones in realistic scenarios: the images are often transformed (cropped, resized, compressed, blurred) for practical usage, and therefore, the detection models should be robust to such transformations. The challenge is based on a novel dataset consisting of 108,750 real and 185,750 AI-generated images from 42 generators comprising a large variety of open-source and closed-source models of various architectures, augmented with 36 image transformations. Methods were evaluated using ROC AUC on the full test set, including both transformed and untransformed images. A total of 511 participants registered, with 20 teams submitting valid final solutions. This report provides a comprehensive overview of the challenge, describes the proposed solutions, and can be used as a valuable reference for researchers and practitioners in increasing the robustness of the detection models to real-world transformations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript presents an overview of the NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild, held at CVPR 2026. It describes the goal of developing detectors robust to real-world image transformations, introduces a novel dataset of 108,750 real and 185,750 AI-generated images sourced from 42 generators (open- and closed-source, various architectures), augmented with 36 transformations, and reports evaluation via ROC AUC on the full test set (transformed and untransformed images). Participation is noted as 511 registrants with 20 valid submissions; the report also summarizes participant solutions.

Significance. If the dataset construction supports realistic robustness testing, this work provides a valuable large-scale benchmark that extends prior efforts by incorporating diverse generators and post-processing transformations, encouraging development of detectors that generalize to practical usage. The scale (nearly 300k images) and inclusion of closed-source models are notable strengths that could serve as a reference for the community.

major comments (1)
  1. [Abstract and dataset description] The abstract and dataset description assert that the 36 transformations (cropping, resizing, compression, blurring, etc.) and 42 generators reflect 'realistic scenarios' and 'practical usage' for 'in the Wild' detection, yet no quantitative validation—such as parameter histograms, coverage analysis against real-world corpora, or generator popularity statistics—is provided to support this selection. This is load-bearing for the central robustness claim.
minor comments (2)
  1. [Dataset construction] The manuscript could clarify in the methods or dataset section how the specific 36 transformations were selected and parameterized (e.g., ranges for compression quality or blur kernels) to aid reproducibility.
  2. [Results and solutions overview] Participation and submission numbers are clearly stated, but a brief table summarizing top-performing methods' key ideas (e.g., architectures or augmentation strategies) would improve readability without altering the descriptive nature of the report.
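
The reproducibility point about transformation parameterization can be made concrete with a sketch. The actual 36 transformations and their parameter ranges belong to the organizers and are not reproduced in this summary; the pure-Python stand-ins below operate on a grayscale image as a list of rows and only show the shape of such a pipeline.

```python
# Illustrative stand-ins for the kinds of edits the challenge applies
# (crop, resize, blur, lossy compression). Parameters here are arbitrary.

def center_crop(img, ch, cw):
    h, w = len(img), len(img[0])
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in img[top:top + ch]]

def nearest_resize(img, nh, nw):
    h, w = len(img), len(img[0])
    return [[img[r * h // nh][c * w // nw] for c in range(nw)] for r in range(nh)]

def box_blur(img):
    h, w = len(img), len(img[0])
    def px(r, c):
        window = [img[rr][cc]
                  for rr in range(max(0, r - 1), min(h, r + 2))
                  for cc in range(max(0, c - 1), min(w, c + 2))]
        return sum(window) // len(window)
    return [[px(r, c) for c in range(w)] for r in range(h)]

def quantize(img, levels):
    # Crude stand-in for lossy compression: snap pixels to a coarse grid.
    step = 256 // levels
    return [[(v // step) * step for v in row] for row in img]

# Chain a few edits, as a transformed test set would be built.
img = [[(r * 16 + c) % 256 for c in range(8)] for r in range(8)]
edited = quantize(box_blur(nearest_resize(center_crop(img, 6, 6), 4, 4)), 8)
print(len(edited), len(edited[0]))  # 4 4
```

Publishing the analogous parameter ranges (e.g., JPEG quality, blur kernel sizes, crop ratios) is what the minor comment asks of the authors.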

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment and the recommendation for minor revision. We address the single major comment below.

point-by-point responses
  1. Referee: [Abstract and dataset description] The abstract and dataset description assert that the 36 transformations (cropping, resizing, compression, blurring, etc.) and 42 generators reflect 'realistic scenarios' and 'practical usage' for 'in the Wild' detection, yet no quantitative validation—such as parameter histograms, coverage analysis against real-world corpora, or generator popularity statistics—is provided to support this selection. This is load-bearing for the central robustness claim.

    Authors: We agree that the manuscript would be strengthened by additional justification for the selection of transformations and generators. These were chosen by the organizers to reflect common real-world post-processing (e.g., JPEG compression and resizing typical of social media uploads) and a representative mix of current open- and closed-source generators. The current text does not contain the quantitative analyses mentioned. In the revised version we will expand the dataset description with a new paragraph providing the rationale, supported by citations to prior work on real-world image degradations and generator prevalence. We will also note limitations in coverage. Full parameter histograms or large-scale corpus coverage analysis would require new data collection outside the scope of this challenge overview paper, but the added textual justification will directly address the load-bearing aspect of the robustness claim.

    revision: partial

Circularity Check

0 steps flagged

No circularity: purely descriptive competition report

full rationale

The paper is an overview of a challenge organization and dataset construction. It states factual counts (108,750 real + 185,750 generated images, 42 generators, 36 transformations) and an evaluation protocol (ROC AUC) without equations, derivations, predictions, fitted parameters, or load-bearing self-citations. No step reduces a claimed result to its own inputs by construction; the content is self-contained as a descriptive report of an external competition setup.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The report relies on standard computer vision challenge practices and does not introduce or depend on new free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5726 in / 932 out tokens · 35081 ms · 2026-05-10T16:38:44.857607+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

93 extracted references · 14 canonical work pages · 7 internal anchors

  1. [1]

    Arniqa: Learning distortion mani- fold for image quality assessment

    Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, and Alberto Del Bimbo. Arniqa: Learning distortion mani- fold for image quality assessment. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 189–198, 2024. 3

  2. [2]

    NT-HAZE: A Benchmark Dataset for Re- alistic Night-time Image Dehazing

    Radu Ancuti, Codruta Ancuti, Radu Timofte, and Cos- min Ancuti. NT-HAZE: A Benchmark Dataset for Re- alistic Night-time Image Dehazing . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  3. [3]

    NTIRE 2026 Nighttime Image Dehazing Challenge Report

    Radu Ancuti, Alexandru Brateanu, Florin Vasluianu, Raul Balmez, Ciprian Orhei, Codruta Ancuti, Radu Timofte, Cos- min Ancuti, et al. NTIRE 2026 Nighttime Image Dehazing Challenge Report . InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  4. [4]

    The jpeg ai standard: Providing efficient human and machine vi- sual data consumption.IEEE MultiMedia, 30(1):100–111,

    Jo ˜ao Ascenso, Elena Alshina, and Touradj Ebrahimi. The jpeg ai standard: Providing efficient human and machine vi- sual data consumption.IEEE MultiMedia, 30(1):100–111,

  5. [5]

    BEiT: BERT Pre-Training of Image Transformers

    Hangbo Bao, Li Dong, and Furu Wei. Beit: Bert pre-training of image transformers.arXiv preprint arXiv:2106.08254,

  6. [6]

    NTIRE 2026 Challenge on Single Image Re- flection Removal in the Wild: Datasets, Results, and Meth- ods

    Jie Cai, Kangning Yang, Zhiyuan Li, Florin Vasluianu, Radu Timofte, et al. NTIRE 2026 Challenge on Single Image Re- flection Removal in the Wild: Datasets, Results, and Meth- ods . InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR) Workshops,

  7. [7]

    Hidream-i1: A high-efficient image generative foundation model with sparse diffusion transformer.arXiv preprint arXiv:2505.22705, 2025

    Qi Cai, Jingwen Chen, Yang Chen, Yehao Li, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Yiheng Zhang, Fengbin Gao, Peihan Xu, et al. Hidream-i1: A high-efficient image gen- erative foundation model with sparse diffusion transformer. arXiv preprint arXiv:2505.22705, 2025. 2

  8. [8]

    Emerg- ing properties in self-supervised vision transformers

    Mathilde Caron, Hugo Touvron, Ishan Misra, Herv ´e J´egou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerg- ing properties in self-supervised vision transformers. In ICCV, pages 9650–9660, 2021. 11

  9. [9]

    Conceptual 12m: Pushing web-scale image-text pre- training to recognize long-tail visual concepts

    Soravit Changpinyo, Piyush Sharma, Nan Ding, and Radu Soricut. Conceptual 12m: Pushing web-scale image-text pre- training to recognize long-tail visual concepts. InProceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3558–3568, 2021. 2

  10. [10]

    The Fourth Challenge on Image Super-Resolution (×4) at NTIRE 2026: Benchmark Results and Method Overview

    Zheng Chen, Kai Liu, Jingkai Wang, Xianglong Yan, Jianze Li, Ziqing Zhang, Jue Gong, Jiatong Li, Lei Sun, Xi- aoyang Liu, Radu Timofte, Yulun Zhang, et al. The Fourth Challenge on Image Super-Resolution (×4) at NTIRE 2026: Benchmark Results and Method Overview . InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition (CVPR) W...

  11. [11]

    Learned image compression with discretized gaussian mixture likelihoods and attention modules

    Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto. Learned image compression with discretized gaussian mixture likelihoods and attention modules. InProceedings of CVPR, pages 7939–7948, 2020. 3

  12. [12]

    Low Light Image Enhancement Challenge at NTIRE 2026

    George Ciubotariu, Sharif S M A, Abdur Rehman, Fayaz Ali Dharejo, Rizwan Ali Naqvi, Marcos Conde, Radu Tim- ofte, et al. Low Light Image Enhancement Challenge at NTIRE 2026 . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Work- shops, 2026. 2

  13. [13]

    High FPS Video Frame Interpolation Challenge at NTIRE 2026

    George Ciubotariu, Zhuyun Zhou, Yeying Jin, Zongwei Wu, Radu Timofte, et al. High FPS Video Frame Interpolation Challenge at NTIRE 2026 . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  14. [14]

    On the detection of synthetic images generated by diffusion models

    Riccardo Corvi, Davide Cozzolino, Guido Zingarini, Gio- vanni Poggi, Kazuhiro Nagano, and Luisa Verdoliva. On the detection of synthetic images generated by diffusion models. InICASSP, pages 1–5, 2023. 12

  15. [15]

    Redcaps: Web-curated image-text data created by the people, for the people.arXiv preprint arXiv:2111.11431, 2021

    Karan Desai, Gaurav Kaul, Zubin Aysola, and Justin John- son. Redcaps: Web-curated image-text data created by the people, for the people.arXiv preprint arXiv:2111.11431,

  16. [16]

    NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Chal- lenge Report

    Andrei Dumitriu, Aakash Ralhan, Florin Miron, Florin Ta- tui, Radu Tudor Ionescu, Radu Timofte, et al. NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Chal- lenge Report . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Work- shops, 2026. 2

  17. [17]

    Conde, Zongwei Wu, Yeying Jin, Radu Timofte, et al

    Omar Elezabi, Marcos V . Conde, Zongwei Wu, Yeying Jin, Radu Timofte, et al. Photography Retouching Trans- fer, NTIRE 2026 Challenge: Report . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  18. [18]

    Eva-02: A visual repre- sentation for neon genesis

    Yuxin Fang, Wenhai Wang, Binhui Xie, Quan Sun, Ledell Wu, Xinggang Wang, Yue Cao, et al. Eva-02: A visual repre- sentation for neon genesis. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 10, 12

  19. [19]

    Leveraging fre- quency analysis for deep fake image recognition

    Joel Frank, Thorsten Eisenhofer, Lea Sch ¨onherr, Asja Fis- cher, Dorothea Kolossa, and Thorsten Holz. Leveraging fre- quency analysis for deep fake image recognition. InInter- national conference on machine learning, pages 3247–3258. PMLR, 2020. 1

  20. [20]

    Rich models for ste- ganalysis of digital images.IEEE Transactions on Informa- tion Forensics and Security, 7(3):868–882, 2012

    Jessica Fridrich and Jan Kodovsk ´y. Rich models for ste- ganalysis of digital images.IEEE Transactions on Informa- tion Forensics and Security, 7(3):868–882, 2012. 10

  21. [21]

    Dat- acomp: In search of the next generation of multimodal datasets.Advances in Neural Information Processing Sys- tems, 36:27092–27112, 2023

    Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, et al. Dat- acomp: In search of the next generation of multimodal datasets.Advances in Neural Information Processing Sys- tems, 36:27092–27112, 2023. 2

  22. [22]

    Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020

    Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020. 1

  23. [23]

    Google. Siglip 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features.https://huggingface.co/google/ siglip2- giant- opt- patch16- 384, 2025. Model card / release page, accessed: 2026-03-20. 8

  24. [24]

    Are gan generated im- ages easy to detect? a critical analysis of the state-of-the-art

    Diego Gragnaniello, Davide Cozzolino, Francesco Marra, Giovanni Poggi, and Luisa Verdoliva. Are gan generated im- ages easy to detect? a critical analysis of the state-of-the-art. InICME, pages 1–6, 2021. 12

  25. [25]

    NTIRE 2026 Challenge on End-to-End Financial Receipt Restoration and Reasoning from Degraded Images: Datasets, Methods and Results

    Bochen Guan, Jinlong Li, Kangning Yang, Chuang Ke, Jie Cai, Florin Vasluianu, Radu Timofte, et al. NTIRE 2026 Challenge on End-to-End Financial Receipt Restoration and Reasoning from Degraded Images: Datasets, Methods and Results . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Work- shops, 2026. 2

  26. [26]

    NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: AI Flash Portrait (Track 3)

    Ya-nan Guan, Shaonan Zhang, Hang Guo, Yawen Wang, Xinying Fan, Jie Liang, Hui Zeng, Guanyi Qin, Lishen Qu, Tao Dai, Shu-Tao Xia, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: AI Flash Portrait (Track 3) . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  27. [27]

    NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

    Aleksandr Gushchin, Khaled Abud, Ekaterina Shumitskaya, Artem Filippov, Georgii Bychkov, Sergey Lavrushkin, Mikhail Erofeev, Anastasia Antsiferova, Changsheng Chen, Shunquan Tan, Radu Timofte, Dmitriy Vatolin, et al. NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild . InProceedings of the IEEE/CVF Conference on Computer Vision and Pa...

  28. [28]

    Momentum contrast for unsupervised visual rep- resentation learning

    Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual rep- resentation learning. InCVPR, pages 9729–9738, 2020. 11

  29. [29]

    Denoising dif- fusion probabilistic models

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising dif- fusion probabilistic models. InNeurIPS, pages 6840–6851,

  30. [30]

    Wildfake: A large- scale and hierarchical dataset for ai-generated images detec- tion

    Yan Hong, Jianming Feng, Haoxing Chen, Jun Lan, Huijia Zhu, Weiqiang Wang, and Jianfu Zhang. Wildfake: A large- scale and hierarchical dataset for ai-generated images detec- tion. InProceedings of the AAAI Conference on Artificial Intelligence, pages 3500–3508, 2025. 1

  31. [31]

    Robust Deepfake De- tection, NTIRE 2026 Challenge: Report

    Benedikt Hopf, Radu Timofte, et al. Robust Deepfake De- tection, NTIRE 2026 Challenge: Report . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  32. [32]

    Lora: Low-rank adaptation of large language mod- els

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen- Zhu, Yuanzhi Li, Shean Wang, Liang Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language mod- els. InInternational Conference on Learning Representa- tions (ICLR), 2022. 6, 10, 12

  33. [33]

    So-fake: Benchmarking and explain- ing social media image forgery detection.arXiv preprint arXiv:2505.18660, 2025

    Zhenglin Huang, Tianxiao Li, Xiangtai Li, Haiquan Wen, Yiwei He, Jiangning Zhang, Hao Fei, Xi Yang, Xiaowei Huang, Bei Peng, et al. So-fake: Benchmarking and explain- ing social media image forgery detection.arXiv preprint arXiv:2505.18660, 2025. 6

  34. [34]

    NTIRE 2026 Low-light Enhancement: Twilight Cowboy Challenge

    Aleksei Khalin, Egor Ershov, Artem Panshin, Sergey Ko- rchagin, Georgiy Lobarev, Arseniy Terekhin, Sofiia Doro- gova, Amir Shamsutdinov, Yasin Mamedov, Bakhtiyar Khalfin, Bogdan Sheludko, Emil Zilyaev, Nikola Bani ´c, Georgy Perevozchikov, Radu Timofte, et al. NTIRE 2026 Low-light Enhancement: Twilight Cowboy Challenge . In Proceedings of the IEEE/CVF Con...

  35. [35]

    Supervised contrastive learning

    Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. InNeurIPS, pages 18661–18673, 2020. 13

  36. [36]

    Black Forest Labs, Stephen Batifol, Andreas Blattmann, Frederic Boesel, Saksham Consul, Cyril Diagne, Tim Dock- horn, Jack English, Zion English, Patrick Esser, et al. Flux. 1 kontext: Flow matching for in-context image generation and editing in latent space.arXiv preprint arXiv:2506.15742,

  37. [37]

    Bridging the gap between ideal and real-world evaluation: Benchmarking ai-generated image detection in challenging scenarios

    Chunxiao Li, Xiaoxiao Wang, Meiling Li, Boming Miao, Peng Sun, Yunjian Zhang, Xiangyang Ji, and Yao Zhu. Bridging the gap between ideal and real-world evaluation: Benchmarking ai-generated image detection in challenging scenarios. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 20379–20389, 2025. 1, 7

  38. [38]

    The First Challenge on Mobile Real-World Image Super- Resolution at NTIRE 2026: Benchmark Results and Method Overview

    Jiatong Li, Zheng Chen, Kai Liu, Jingkai Wang, Zihan Zhou, Xiaoyang Liu, Libo Zhu, Radu Timofte, Yulun Zhang, et al. The First Challenge on Mobile Real-World Image Super- Resolution at NTIRE 2026: Benchmark Results and Method Overview . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Work- shops, 2026. 2

  39. [39]

    Autoregressive image generation without vec- tor quantization.Advances in Neural Information Processing Systems, 37:56424–56445, 2024

    Tianhong Li, Yonglong Tian, He Li, Mingyang Deng, and Kaiming He. Autoregressive image generation without vec- tor quantization.Advances in Neural Information Processing Systems, 37:56424–56445, 2024. 1

  40. [40]

    NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

    Xin Li, Jiachao Gong, Xijun Wang, Shiyao Xiong, Bingchen Li, Suhang Yao, Chao Zhou, Zhibo Chen, Radu Timofte, et al. NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results . InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  41. [41]

    NTIRE 2026 The Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

    Xin Li, Yeying Jin, Suhang Yao, Beibei Lin, Zhaoxin Fan, Wending Yan, Xin Jin, Zongwei Wu, Bingchen Li, Peishu Shi, Yufei Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby Tan, Radu Timofte, et al. NTIRE 2026 The Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results . InProceedings of the IEEE/CVF Con- ference on Computer...

  42. [42]

    The First Chal- lenge on Remote Sensing Infrared Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview

    Kai Liu, Haoyang Yue, Zeli Lin, Zheng Chen, Jingkai Wang, Jue Gong, Radu Timofte, Yulun Zhang, et al. The First Chal- lenge on Remote Sensing Infrared Image Super-Resolution at NTIRE 2026: Benchmark Results and Method Overview . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  43. [43]

    Conde, et al

    Shuhong Liu, Ziteng Cui, Chenyu Bao, Xuangeng Chu, Lin Gu, Bin Ren, Radu Timofte, Marcos V . Conde, et al. 3D Restoration and Reconstruction in Adverse Conditions: Re- alX3D Challenge Results . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  44. [44]

    NTIRE 2026 X- AIGC Quality Assessment Challenge: Methods and Results

    Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Qiang Hu, Jiezhang Cao, Yu Zhou, Wei Sun, Farong Wen, Zitong Xu, Yingjie Zhou, Huiyu Duan, Lu Liu, Jiarui Wang, Siqi Luo, Chunyi Li, Li Xu, Zicheng Zhang, Yue Shi, Yubo Wang, Minghong Zhang, Chunchao Guo, Zhichao Hu, Mingtao Chen, Xiele Wu, Xin Ma, Zhaohe Lv, Yuanhao Xue, Jiaqi Wang, Xinxing Sha, Radu Timofte, et...

  45. [45]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In ICCV, pages 10012–10022, 2021. 11

  46. [46]

    A convnet for the 2020s

    Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feicht- enhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. InCVPR, pages 11976–11986, 2022. 11, 12

  47. [47]

    SGDR: Stochastic Gradient Descent with Warm Restarts

    Ilya Loshchilov and Frank Hutter. Sgdr: Stochas- tic gradient descent with warm restarts.arXiv preprint arXiv:1608.03983, 2016. 6

  48. [48]

    Decoupled Weight Decay Regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 6, 13

  49. [49]

    NTIRE 2026 Challenge on Video Saliency Predic- tion: Methods and Results

    Andrey Moskalenko, Alexey Bryncev, Ivan Kosmynin, Kira Shilovskaya, Mikhail Erofeev, Dmitry Vatolin, Radu Timo- fte, et al. NTIRE 2026 Challenge on Video Saliency Predic- tion: Methods and Results . InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

  50. [50]

    If image generation model from deepfloyd lab

    none. If image generation model from deepfloyd lab. https : / / github . com / deep - floyd / IF, . Ac- cessed: 2026-03-22. 2

  51. [51]

    Grok imagine image generation model from xai

    none. Grok imagine image generation model from xai. https://grok.com/imagine, . Accessed: 2026-03-

  52. [52]

    Nano banana image generation model from google.https://blog.google/products- and- platforms/products/gemini/updated-image- editing-model/,

    none. Nano banana image generation model from google.https://blog.google/products- and- platforms/products/gemini/updated-image- editing-model/, . Accessed: 2026-03-22. 3

  53. [53]

    Nano banana 2 image generation model from google

    none. Nano banana 2 image generation model from google. https://blog.google/innovation- and- ai/ technology / ai / nano - banana - 2/, . Accessed: 2026-03-22. 2

  54. [54]

    Seedream 5 lite image generation model from bytedance.https://seed.bytedance.com/en/ seedream5_0_lite,

    none. Seedream 5 lite image generation model from bytedance.https://seed.bytedance.com/en/ seedream5_0_lite, . Accessed: 2026-03-22. 2

  55. [55]

    Towards uni- versal fake image detectors that generalize across genera- tive models

    Utkarsh Ojha, Yuheng Li, and Yong Jae Lee. Towards uni- versal fake image detectors that generalize across genera- tive models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 24480– 24489, 2023. 1, 12

[56] Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy V. Vo, Wiktor Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaa El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jégou, Julien Mairal, P...

[57] Hyunhee Park, Eunpil Park, Sangmin Lee, Radu Timofte, et al. NTIRE 2026 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[58] Georgy Perevozchikov, Daniil Vladimirov, Radu Timofte, et al. NTIRE 2026 Challenge on Learned Smartphone ISP with Unpaired Data: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[59] Guanyi Qin, Jie Liang, Bingbing Zhang, Lishen Qu, Ya-nan Guan, Hui Zeng, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Professional Image Quality Assessment (Track 1). In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[60] Xingyu Qiu, Yuqian Fu, Jiawei Geng, Bin Ren, Jiancheng Pan, Zongwei Wu, Hao Tang, Yanwei Fu, Radu Timofte, Nicu Sebe, Mohamed Elhoseiny, et al. The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[61] Lishen Qu, Yao Liu, Jie Liang, Hui Zeng, Wen Dai, Ya-nan Guan, Guanyi Qin, Shihao Zhou, Jufeng Yang, Lei Zhang, Radu Timofte, et al. NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Multi-Exposure Image Fusion in Dynamic Scenes (Track 2). In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[62] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning (ICML), pages 8748–8763, 2021. 10, 11, 12

[63] Bin Ren, Hang Guo, Yan Shu, Jiaqi Ma, Ziteng Cui, Shuhong Liu, Guofeng Mei, Lei Sun, Zongwei Wu, Fahad Shahbaz Khan, Salman Khan, Radu Timofte, Yawei Li, et al. The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[64] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022. 1, 2, 11

[65] Tim Seizinger, Florin-Alexandru Vasluianu, Marcos V. Conde, Jeffrey Chen, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. The First Controllable Bokeh Rendering Challenge at NTIRE 2026. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[66] Tomáš Souček, Sylvestre-Alvise Rebuffi, Pierre Fernandez, Nikola Jovanović, Hady Elsahar, Valeriu Lacatusu, Tuan Tran, and Alexandre Mourachko. Transferable black-box one-shot forging of watermarks via image preference models. In Advances in Neural Information Processing Systems,

[67] Lei Sun, Hang Guo, Bin Ren, Shaolin Su, Xian Wang, Danda Pani Paudel, Luc Van Gool, Radu Timofte, Yawei Li, et al. The Third Challenge on Image Denoising at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[68] Lei Sun, Weilun Li, Xian Wang, Zhendong Li, Letian Shi, Dannong Xu, Deheng Zhang, Mengshun Hu, Shuang Guo, Shaolin Su, Radu Timofte, Danda Pani Paudel, Luc Van Gool, et al. The Second Challenge on Event-Based Image Deblurring at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Wor...

[69] Lei Sun, Xiaolong Qian, Qi Jiang, Xian Wang, Yao Gao, Kailun Yang, Kaiwei Wang, Radu Timofte, Danda Pani Paudel, Luc Van Gool, et al. NTIRE 2026 The First Challenge on Blind Computational Aberration Correction: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[70] Quan Sun, Yuxin Fang, Ledell Wu, Xinlong Wang, and Yue Cao. EVA-CLIP: Improved training techniques for CLIP at scale. arXiv preprint arXiv:2303.15389, 2023. 6

[71] Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, and Yunchao Wei. Rethinking the up-sampling operations in CNN-based generative network for generalizable deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 28130–28139, 2024. 1

[72] Mingxing Tan and Quoc V. Le. EfficientNetV2: Smaller models and faster training. In ICML, pages 10096–10106, 2021. 12

[73] Florin-Alexandru Vasluianu, Tim Seizinger, Jeffrey Chen, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. Learning-Based Ambient Lighting Normalization: NTIRE 2026 Challenge Results and Findings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[74] Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Zongwei Wu, Radu Timofte, et al. Advances in Single-Image Shadow Removal: Results from the NTIRE 2026 Challenge. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[75] Guo-Hua Wang, Liangfu Cao, Tianyu Cui, Minghao Fu, Xiaohao Chen, Pengxin Zhan, Jianshan Zhao, Lan Li, Bowen Fu, Jiaqi Liu, and Qing-Guo Chen. Ovis-Image technical report. arXiv preprint arXiv:2511.22982, 2025. 2

[76] Jingkai Wang, Jue Gong, Zheng Chen, Kai Liu, Jiatong Li, Yulun Zhang, Radu Timofte, et al. The Second Challenge on Real-World Face Restoration at NTIRE 2026: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[77] Longguang Wang, Yulan Guo, Yingqian Wang, Juncheng Li, Sida Peng, Ye Zhang, Radu Timofte, Minglin Chen, Yi Wang, Qibin Hu, Wenjie Lei, et al. NTIRE 2026 Challenge on 3D Content Super-Resolution: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2

[78] Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, and Alexei A. Efros. CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8695–8704, 2020. 1, 12

[79] Tongzhou Wang and Phillip Isola. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. arXiv preprint arXiv:2005.10242, 2020. 12

[80] Yingqian Wang, Zhengyu Liang, Fengyuan Zhang, Wending Zhao, Longguang Wang, Juncheng Li, Jungang Yang, Radu Timofte, Yulan Guo, et al. NTIRE 2026 Challenge on Light Field Image Super-Resolution: Methods and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026. 2
