Keep all tabs open throughout your search so that you can accurately record all resources and queries at the end

Do NOT close any tabs while searching

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments

cs.CL · 2026-04-15 · unverdicted · novelty 6.0

MERRIN benchmark shows AI agents average only 22.3% accuracy on multimodal evidence retrieval and multi-hop reasoning over noisy conflicting web sources, with the best reaching 40.1%.

citing papers explorer

Showing 1 of 1 citing paper.

MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments
cs.CL · 2026-04-15 · unverdicted · none · ref 8
MERRIN benchmark shows AI agents average only 22.3% accuracy on multimodal evidence retrieval and multi-hop reasoning over noisy conflicting web sources, with the best reaching 40.1%.
