Newsclippings: Automatic generation of out-of-context multimodal media

· 2021 · arXiv 2104.05893

5 Pith papers cite this work. Polarity classification is still being indexed.

5 Pith papers citing it

representative citing papers

CLIPScore: A Reference-free Evaluation Metric for Image Captioning

cs.CV · 2021-04-18 · conditional · novelty 8.0

CLIPScore uses a web-pretrained CLIP model to evaluate image captions without references and achieves higher human correlation than CIDEr or SPICE.

XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception

cs.CL · 2026-03-23 · unverdicted · novelty 7.0

The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

cs.AI · 2025-12-28 · unverdicted · novelty 6.0

RW-Post is an auditable benchmark linking social media posts to evidence from human fact-check articles for evaluating multimodal AI fact-checking across different evidence regimes.

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

cs.MM · 2026-05-11 · unverdicted · novelty 5.0

RW-Post is an auditable text-image benchmark for real-world multimodal fact-checking that links posts to evidence traces from human fact-check articles and includes the AgentFact baseline for evaluation.

Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis

cs.MM · 2025-04-14 · unverdicted · novelty 5.0

CRAVE is a new framework that clusters retrieved text and image evidence into narratives and uses an LLM judge to produce explained fact-checking verdicts.

citing papers explorer

Showing 5 of 5 citing papers.

CLIPScore: A Reference-free Evaluation Metric for Image Captioning

cs.CV · 2021-04-18 · conditional · none · ref 34

CLIPScore uses a web-pretrained CLIP model to evaluate image captions without references and achieves higher human correlation than CIDEr or SPICE.

XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception

cs.CL · 2026-03-23 · unverdicted · none · ref 30

The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

cs.AI · 2025-12-28 · unverdicted · none · ref 35

RW-Post is an auditable benchmark linking social media posts to evidence from human fact-check articles for evaluating multimodal AI fact-checking across different evidence regimes.

RW-Post: Auditable Evidence-Grounded Multimodal Fact-Checking in the Wild

cs.MM · 2026-05-11 · unverdicted · none · ref 26

RW-Post is an auditable text-image benchmark for real-world multimodal fact-checking that links posts to evidence traces from human fact-check articles and includes the AgentFact baseline for evaluation.

Fact-Checking with Contextual Narratives: Leveraging Retrieval-Augmented LLMs for Social Media Analysis

cs.MM · 2025-04-14 · unverdicted · none · ref 10

CRAVE is a new framework that clusters retrieved text and image evidence into narratives and uses an LLM judge to produce explained fact-checking verdicts.
