{"total":20,"items":[{"citing_arxiv_id":"2605.30062","ref_index":72,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"FakeVLM-R1: Internalizing Physical Laws via CoT for Synthetic Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-28T15:13:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"FakeVLM-R1 combines GRPO reinforcement learning with critical-thinking CoT and a physics-annotated FakeClue++ dataset to reach claimed SOTA synthetic image detection while reducing over-rejection of real images.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.27348","ref_index":34,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"When Eyes Betray AI: Social Gaze Consistency as a Semantic Cue for AI-Generated Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-26T17:50:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Social gaze consistency between interacting people is proposed as a new semantic cue orthogonal to low-level artifacts for detecting AI-generated images, with reported accuracy gains on vision and vision-language models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.26421","ref_index":80,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HydraPrompt: An Adaptive and Asymmetric Framework of Vision-Language Models for Synthetic Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-26T01:20:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HydraPrompt uses an Asymmetric Prompt Adapter with fixed real prompts and adaptive fake prompts plus a Conditional Supervised Contrastive loss to achieve SOTA synthetic image detection on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16080","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ReAlign: Generalizable Image Forgery Detection via Reasoning-Aligned Representation","primary_cat":"cs.CV","submitted_at":"2026-05-15T15:43:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ReAlign distills LLM-generated reasoning texts into a lightweight AIGI forgery detector via contrastive image-text alignment to improve generalization on complex forgeries.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14486","ref_index":61,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-14T07:26:36+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SEF introduces GAN upsampling for diverse artifacts and expert fusion to reduce domain interference, yielding stronger generalization on 13 benchmarks for AI-generated image detection.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09296","ref_index":49,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts","primary_cat":"cs.CV","submitted_at":"2026-05-10T03:44:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MDMF detects AI-generated images by learning patch-level forensic signatures and quantifying their distributional discrepancies with MMD, yielding larger separation than global methods when micro-defects are present.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"11LOTA [42]ICCV'2566.8465.7367.1866.6180.6888.3374.8584.4977.9578.0678.9687.9273.5583.9967.4176.1682.3490.1974.4280.16C2P-CLIP [38]AAAI'2572.1277.8869.0775.1090.0695.7248.6874.0499.8499.8885.8294.1994.3997.6982.2791.6098.9799.6082.3689.52SAFE [18]KDD'2565.5159.5264.7859.1291.4194.3687.4292.6493.0792.1190.8094.5790.1193.8588.8492.2894.4196.5385.1586.11AIDE [49]ICLR'2590.3290.9686.9688.0890.4494.9578.7787.9799.6299.6596.4698.2697.6298.8598.1999.1099.5099.7593.1095.29Effort [50]ICML'2588.2889.9683.7485.8994.1597.3084.1492.4799.9699.9694.4697.5694.2497.5295.5297.9399.8499.9392.7095.39F-ConV [56] NeurIPS'2592.7491.6588.5187.67 88.87 88.47 85.94 84.88 98.94 98.98 98.14 98.7298.5298.38 96.79 96.33 95.52 95.38 93."},{"citing_arxiv_id":"2605.07074","ref_index":7,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Decoupling Semantics and Fingerprints: A Universal Representation for AI-Generated Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-08T00:48:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ODP-Net uses instance-aware orthogonal decomposition, perturbation-based purification, and manifold alignment to separate universal forgery traces, generator fingerprints, and semantics, achieving SOTA on unseen architectures like Stable Diffusion 3.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"framework maps diverse forgeries onto a compact, generalized manifold unseen data where those specific fingerprints or semantics are absent. More details are given in the supplementary material (S1). The impact of this entanglement is evident in the la- tent space visualizations (Fig. 4). Existing methods produce fragmented clusters (FatFormer[6]) or dispersed distributions (AIDE[7]), indicating that the model is classifying based on \"who generated the image\" rather than \"whether the image is fake.\" To achieve true generalization, a detector must discard the \"who\" and the \"what\" (semantics) to focus solely on the \"whether\" (universal traces). To this end, we propose Orthogonal Decomposition and Purification Network (ODP-Net), a unified framework that"},{"citing_arxiv_id":"2605.08226","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SPECTRA-Net: Scalable Pipeline for Explainable Cross-domain Tensor Representations for AI-generated Images Detection","primary_cat":"cs.CV","submitted_at":"2026-05-06T16:35:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"SPECTRA-Net fuses multi-view tensor representations from vision foundation models, spectral analysis, local anomaly detection, and statistical descriptors to achieve state-of-the-art cross-domain AI-generated image detection with explainable artifact localization.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Frequency-aware detectors [6] and hybrid RGB-FFT pipelines [ 15] have shown promising results, particularly under controlled settings. Recent advances in dual-domain processing, such as wavelet-based diffusion models for blind image separation [ 9], demonstrate the effectiveness of combining frequency and spatial features for artifact removal. Complementarily, patch-based and pixel-level approaches such as LaDeDa [ 2] and PatchCraft [26] highlight the importance of local- ized texture inconsistencies and enable spatial explainability through anomaly heatmaps. While effective, these methods often rely on a single representation domain-either frequency, spatial texture, or semantic features-which limits robustness under diverse real-world transformations such as JPEG compression, Gaussian blur, and platform re-encoding, where single-modality"},{"citing_arxiv_id":"2605.04445","ref_index":50,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection","primary_cat":"cs.CV","submitted_at":"2026-05-06T03:21:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LEGO uses multiple generator-specific LoRA modules modulated by an MLP and fused with attention to detect synthetic images, achieving better performance than prior methods while using under 10% of the training data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04358","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Intermediate Representations are Strong AI-Generated Image Detectors","primary_cat":"cs.CV","submitted_at":"2026-05-05T23:26:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Intermediate layer embedding sensitivity to perturbations distinguishes AI-generated images from real ones, yielding higher AUROC on GenImage and Forensics Small benchmarks than prior methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.16879","ref_index":55,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Adaptive Forensic Feature Refinement via Intrinsic Importance Perception","primary_cat":"cs.CV","submitted_at":"2026-04-18T07:07:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"I2P adaptively selects the most discriminative layers from visual foundation models for synthetic image detection and constrains task updates to low-sensitivity parameter subspaces to improve specificity without harming generalization.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.12307","ref_index":38,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Boosting Robust AIGI Detection with LoRA-based Pairwise Training","primary_cat":"cs.CV","submitted_at":"2026-04-14T05:35:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"LoRA-based pairwise training with distortion and size simulations boosts robust AIGI detection under severe distortions, placing third in the NTIRE challenge.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.11487","ref_index":86,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild","primary_cat":"cs.CV","submitted_at":"2026-04-13T13:53:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The NTIRE 2026 challenge provides a dataset of over 294,000 real and AI-generated images with 36 transformations to benchmark robust detection models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2602.01738","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Simplicity Prevails: The Emergence of Generalizable AIGI Detection in Visual Foundation Models","primary_cat":"cs.CV","submitted_at":"2026-02-02T07:20:02+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Frozen features from vision foundation models enable a linear probe to outperform specialized AIGI detectors by over 30% on in-the-wild data due to emergent forgery knowledge from pre-training.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.12982","ref_index":70,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes","primary_cat":"cs.CV","submitted_at":"2025-12-15T04:58:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.16136","ref_index":73,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"How Noise Benefits AI-generated Image Detection","primary_cat":"cs.CV","submitted_at":"2025-11-20T08:16:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"PiN-CLIP jointly trains a noise generator and detector under a variational positive-incentive principle to inject feature-space noise that suppresses shortcut directions and improves out-of-distribution accuracy by 5.4 points on images from 42 generative models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.21864","ref_index":47,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Deepfakes: we need to re-think the concept of \"real\" images","primary_cat":"cs.CV","submitted_at":"2025-09-26T04:40:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"This position paper contends that the concept of 'real' images must be rethought because most modern photographs are computationally generated, undermining current deepfake detection methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.10236","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Navigating the Challenges of AI-Generated Image Detection in the Wild: What Truly Matters?","primary_cat":"cs.CV","submitted_at":"2025-07-14T12:56:55+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The ITW-SM dataset and targeted optimization of detector design choices yield a 26.87% average AUC improvement for state-of-the-art AI-generated image detectors under real-world social media conditions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2503.21210","ref_index":84,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Toward Generalizable Forgery Detection and Reasoning","primary_cat":"cs.CV","submitted_at":"2025-03-27T06:54:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FakeReasoning is an MLLM-based framework for unified forgery detection and reasoning on AI-generated images, supported by the new MMFR-Dataset of 120K images and 378K annotations across 10 generators.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.20704","ref_index":52,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HFI: A unified framework for training-free detection and implicit watermarking of latent diffusion model generated images","primary_cat":"cs.CV","submitted_at":"2024-12-30T04:34:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HFI detects LDM-generated images without training data by quantifying aliasing in autoencoder outputs and supports model-specific implicit watermarking.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}