AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot

2026 · cs.AI · arXiv 2604.13940

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Scientific peer review faces mounting strain as submission volumes surge, making it increasingly difficult to sustain review quality, consistency, and timeliness. Recent advances in AI have led the community to consider its use in peer review, yet a key unresolved question is whether AI can generate technically sound reviews at real-world conference scale. Here we report the first large-scale field deployment of AI-assisted peer review: every main-track submission at AAAI-26 received one clearly identified AI review from a state-of-the-art system. The system combined frontier models, tool use, and safeguards in a multi-stage process to generate reviews for all 22,977 full-review papers in less than a day. A large-scale survey of AAAI-26 authors and program committee members showed that participants not only found AI reviews useful, but actually preferred them to human reviews on key dimensions such as technical accuracy and research suggestions. We also introduce a novel benchmark and find that our system substantially outperforms a simple LLM-generated review baseline at detecting a variety of scientific weaknesses. Together, these results show that state-of-the-art AI methods can already make meaningful contributions to scientific peer review at conference scale, opening a path toward the next generation of synergistic human-AI teaming for evaluating research.

representative citing papers

Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions

cs.AI · 2026-05-31 · unverdicted · novelty 5.0

An empirical study of 20 computer architecture papers finds that AI reviews capture a significant fraction of human-raised issues while also surfacing additional ones. The comparison uses a released tool that clusters AI comments.

Towards Automating Scientific Review with Google's Paper Assistant Tool

cs.LG · 2026-06-26 · unverdicted · novelty 4.0

Presents PAT, an agentic AI review tool using inference scaling, and claims 34% better math-error recall on the SPOT benchmark and successful pilots at the STOC and ICML conferences.
