Cosmos: Catching out-of-context misinformation with self-supervised learning

Cosmos: Catching out-of-context misinformation with self-supervised learning · 2021 · arXiv 2101.06278

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking

cs.IR · 2026-01-13 · conditional · novelty 8.0

VeriTaS is the first dynamic benchmark for multimodal automated fact-checking that updates quarterly with real-world claims and a standardized scoring scheme to resist data leakage.

ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection

cs.AI · 2026-06-23 · unverdicted · novelty 7.0

ReMMD presents ReMMDBench (500 samples, 2756 images, five languages, five-way veracity) and ReMMD-Agent, which achieves 41.80% accuracy and 39.12% macro-F1 on five-way classification with GPT-5.2 while cutting costs versus prior agents.

When Seeing Is Not Believing -- A Benchmark for Search-Grounded Video Misinformation Detection

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

EVID-Bench supplies 222 videos across nine manipulation types in three categories and shows that frontier multimodal models reach at most 61.43% point-level accuracy when forced to use web search to identify false information.

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

SynCred-Bench shows that 15 MLLMs reach only 10.5% TPR, open-source detectors under 5%, commercial APIs 57.6%, and humans 63% TPR at 5% FPR when identifying AI-generated images with synthetic credibility.

The Warrant Gap: Claim-Conditioned Re-scoring for Fact-Checking

cs.CL · 2026-06-23 · unverdicted · novelty 6.0

Introduces claim-conditioned re-scoring (SIFT) and the warranted supports proportion (WSP) metric, reporting accuracy recovery of up to 27.6 points and WSP calibration at AUC 0.92 on FEVER, SciFact, and other benchmarks.

T-IMPACT: A Severity-Aware Benchmark for Contextual Image-Text Manipulation

cs.CV · 2026-06-21 · unverdicted · novelty 6.0

T-IMPACT is a new benchmark dataset and pipeline that supplies nearly 99k manipulated image-text pairs together with a human-calibrated continuous severity signal for contextual interpretation change.

Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis

cs.MM · 2025-04-14 · unverdicted · novelty 5.0

CRAVE is a new framework that clusters retrieved text and image evidence into narratives and uses an LLM judge to produce explained fact-checking verdicts.

citing papers explorer

VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking
cs.IR · 2026-01-13 · conditional · none · ref 1

ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection
cs.AI · 2026-06-23 · unverdicted · none · ref 54

When Seeing Is Not Believing -- A Benchmark for Search-Grounded Video Misinformation Detection
cs.CV · 2026-06-02 · unverdicted · none · ref 13

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation
cs.CV · 2026-06-02 · unverdicted · none · ref 32

The Warrant Gap: Claim-Conditioned Re-scoring for Fact-Checking
cs.CL · 2026-06-23 · unverdicted · none · ref 90

T-IMPACT: A Severity-Aware Benchmark for Contextual Image-Text Manipulation
cs.CV · 2026-06-21 · unverdicted · none · ref 2

Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis
cs.MM · 2025-04-14 · unverdicted · none · ref 20
