Repro-bench: Can agentic AI systems assess the reproducibility of social science research?

Association for Computational Linguistics · 2025 · DOI 10.18653/v1/2025.findings-acl.1210

2 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues

cs.CL · 2026-06-16 · unverdicted · novelty 7.0

ReproRepo uses GitHub issues as natural supervision to benchmark LLM agents on detecting reproducibility blockers across 1,149 ML papers, with the top agent finding related issues for roughly 90% of cases.

ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review

cs.DL · 2026-05-04 · unverdicted · novelty 6.0 · 2 refs

ARA uses LLMs to build workflow graphs linking sources, methods, and outputs in papers, then scores reproducibility, reaching ~61% accuracy on 213 ReScience C articles and outperforming priors on ReproBench and GoldStandardDB.

citing papers explorer

ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues

cs.CL · 2026-06-16 · unverdicted · none · ref 7

ReproRepo uses GitHub issues as natural supervision to benchmark LLM agents on detecting reproducibility blockers across 1,149 ML papers, with the top agent finding related issues for roughly 90% of cases.

ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review

cs.DL · 2026-05-04 · unverdicted · none · ref 29 · 2 links

ARA uses LLMs to build workflow graphs linking sources, methods, and outputs in papers, then scores reproducibility, reaching ~61% accuracy on 213 ReScience C articles and outperforming priors on ReproBench and GoldStandardDB.
