Distributed Peer Review at ALMA: An Empirical Comparison with Panel-Based Review
Pith reviewed 2026-06-26 11:20 UTC · model grok-4.3
The pith
Distributed peer review at ALMA produces ranking trends that match those from panel evaluations across demographics and science areas.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DPR largely reproduces the systematic ranking trends observed in panel evaluations across PI demographics, technical characteristics, and scientific areas, consistent with panel outcomes both before and after discussion. Scientific diversity among the top-ranked proposals is similar between DPR and post-discussion panel rankings. Individual proposal ranks show substantial dispersion under both DPR and panel assessments prior to discussion, with discussion only partially reducing this variance. The observed dispersion therefore reflects intrinsic variation in reviewer judgments rather than a byproduct of the distributed process itself.
What carries the argument
Distributed peer review, in which each PI designates a reviewer who evaluates a set of proposals without collective discussion. Its population-level ranking structure is compared against that of traditional panel review.
If this is right
- Systematic ranking patterns by demographics, technical features, and science area remain comparable under both review methods.
- The share of top-ranked proposals from different scientific areas stays similar to post-discussion panel results.
- Large dispersion in individual proposal ranks occurs in both systems and is only partly reduced by panel discussion.
- The majority of written comments receive high or adequate quality ratings with no clear link to reviewer career stage.
Where Pith is reading between the lines
- Observatories facing rising proposal numbers could switch to distributed review while expecting little change in the overall distribution of awarded time.
- Targeted checks on the roughly 10 percent of reviews rated low quality (about 1,600 of the approximately 16,000 reviews produced each cycle) would be needed to sustain quality standards.
- Because rank variation appears inherent to individual judgments, assigning more than one reviewer per proposal might narrow the spread without changing the review format.
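The last point can be illustrated with a small simulation. This is a minimal sketch, not the paper's analysis: it assumes each reviewer scores a proposal as its true quality plus independent Gaussian noise, and the function name `spread_of_mean_score` and all parameter values are hypothetical. Under that assumption, the spread of a proposal's mean score shrinks roughly as one over the square root of the number of reviewers.

```python
import random
import statistics


def spread_of_mean_score(n_reviewers: int, noise_sd: float = 1.0,
                         trials: int = 20000, seed: int = 0) -> float:
    """Standard deviation of one proposal's mean score over simulated review rounds.

    Assumption (not from the paper): reviewer scores are the true quality (0.0)
    plus independent Gaussian noise with standard deviation `noise_sd`.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Every reviewer sees the same proposal; only the noise differs.
        scores = [rng.gauss(0.0, noise_sd) for _ in range(n_reviewers)]
        means.append(statistics.fmean(scores))
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Print the simulated spread for a few hypothetical reviewer counts.
    for k in (1, 4, 10):
        print(k, round(spread_of_mean_score(k), 3))
```

If reviewer disagreements are correlated, for example through shared preferences for a science area, the reduction would be smaller than this idealized case suggests.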
Load-bearing premise
The sets of proposals and reviewers before and after the switch to distributed review are similar enough that any matching patterns can be credited to the review method rather than other simultaneous changes.
What would settle it
A clear divergence in ranking trends by proposer demographics or scientific area between the distributed-review cycles and the earlier panel cycles, once any shifts in proposal volume or demographics are accounted for.
Original abstract
Large astronomical observatories are increasingly adopting distributed peer review (DPR) to manage growing proposal volumes, yet empirical comparisons with panel-based systems remain limited. Beginning in 2021 (Cycle 8), the Atacama Large Millimeter/submillimeter Array (ALMA) transitioned to DPR for the majority of proposals, with DPR applied to nearly all proposals from Cycle 9 onward. Under DPR, each Principal Investigator (PI) designates a reviewer to evaluate a set of proposals without collective discussion. This study analyzes over 20,000 proposals and 160,000 reviews spanning 13 cycles to assess the impact of this change. We find that DPR largely reproduces the systematic ranking trends observed in panel evaluations across PI demographics, technical characteristics, and scientific areas, consistent with panel outcomes both before and after discussion. Scientific diversity among the top-ranked proposals is similar between DPR and post-discussion panel rankings. Individual proposal ranks show substantial dispersion under both DPR and panel assessments prior to discussion, with discussion only partially reducing this variance. The observed dispersion therefore reflects intrinsic variation in reviewer judgments rather than a byproduct of the distributed process itself. In Cycle 12, reviewers rated the majority of DPR written comments as high or adequate quality, with no dependence on reviewer career stage. However, 10% of reviews were rated as low quality, highlighting the challenge of maintaining quality standards across the approximately 16,000 reviews produced each cycle. Overall, our results indicate that DPR reproduces the population-level ranking structure obtained in panel review, despite differences in review mechanics and the role of discussion.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes over 20,000 ALMA proposals and 160,000 reviews across 13 cycles to empirically compare distributed peer review (DPR, adopted from Cycle 8 onward) with prior panel-based review. It concludes that DPR largely reproduces the systematic ranking trends seen in panel evaluations across PI demographics, technical characteristics, and scientific areas; that scientific diversity among top-ranked proposals is similar; that rank dispersion reflects intrinsic reviewer judgment variation rather than the distributed format; and that most DPR written comments are rated high or adequate quality (with 10% low quality).
Significance. If the central empirical claims hold after addressing pool-comparability issues, the work supplies a large-scale, population-level test of DPR viability for high-volume observatories. The dataset size (20k proposals, 160k reviews) is a clear strength that supports inferences about aggregate ranking structure and diversity, and the finding that discussion only partially reduces variance is a useful falsifiable observation.
major comments (2)
- [Methods] The central claim that similarities in ranking trends can be attributed to DPR mechanics (rather than concurrent changes in the proposal pool) requires evidence that pre- and post-Cycle 8 populations are comparable. No balance tables, covariate-shift tests, or explicit controls for science-area mix, PI career-stage distribution, or proposal volume at the transition are described.
- [Results] The abstract states that DPR reproduces trends 'consistent with panel outcomes both before and after discussion,' but the manuscript supplies no description of the statistical methods, regression specifications, or cycle-fixed effects used to establish this consistency or to isolate the review-process effect.
minor comments (2)
- [Abstract] The abstract would benefit from a one-sentence statement of the statistical approach and any controls applied.
- [Figures/Tables] Figure captions and table notes should explicitly state the sample sizes and cycle ranges used for each panel or comparison.
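The balance check requested in the first major comment could look something like the sketch below. It is an illustration only: the category labels, the counts, and the function names `balance_table` and `chi_square_statistic` are hypothetical and do not come from the manuscript. The sketch tabulates the share of proposals per science area before and after the transition and computes a Pearson chi-square statistic for the resulting 2 x K table.

```python
def shares(counts: dict) -> dict:
    """Convert raw counts per category into fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def balance_table(pre: dict, post: dict) -> list:
    """Rows of (category, pre share, post share, post minus pre).

    Assumes both dictionaries contain the same category keys.
    """
    p, q = shares(pre), shares(post)
    return [(k, p[k], q[k], q[k] - p[k]) for k in sorted(pre)]


def chi_square_statistic(pre: dict, post: dict) -> float:
    """Pearson chi-square statistic for a 2 x K table of category counts."""
    n_pre, n_post = sum(pre.values()), sum(post.values())
    n = n_pre + n_post
    stat = 0.0
    for k in pre:
        column_total = pre[k] + post[k]
        for observed, row_total in ((pre[k], n_pre), (post[k], n_post)):
            expected = row_total * column_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not ALMA data.
    panel_cycles = {"area_A": 300, "area_B": 500, "area_C": 200}
    dpr_cycles = {"area_A": 330, "area_B": 480, "area_C": 190}
    for row in balance_table(panel_cycles, dpr_cycles):
        print(row)
    print(chi_square_statistic(panel_cycles, dpr_cycles))
```

A statistic near zero indicates a similar category mix in the two periods. Judging significance would require comparing it with a chi-square distribution with K - 1 degrees of freedom, which this sketch leaves out.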
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the attribution of our findings to the review process. We address each major comment below.
- Referee: [Methods] The central claim that similarities in ranking trends can be attributed to DPR mechanics (rather than concurrent changes in the proposal pool) requires evidence that pre- and post-Cycle 8 populations are comparable. No balance tables, covariate-shift tests, or explicit controls for science-area mix, PI career-stage distribution, or proposal volume at the transition are described.
  Authors: We agree that the manuscript does not currently include balance tables, covariate-shift tests, or explicit controls for pool composition at the Cycle 8 transition. To support the claim that ranking similarities arise from DPR mechanics rather than pool changes, the revised manuscript will add balance tables for science-area mix, PI career stage, and proposal volume, along with regression models incorporating cycle-fixed effects and relevant covariates to isolate the review-format effect.
  Revision: yes
- Referee: [Results] The abstract states that DPR reproduces trends 'consistent with panel outcomes both before and after discussion,' but the manuscript supplies no description of the statistical methods, regression specifications, or cycle-fixed effects used to establish this consistency or to isolate the review-process effect.
  Authors: The referee is correct that the manuscript lacks a detailed description of the statistical methods used to establish consistency between DPR and panel rankings. In revision we will expand the Methods section to specify the regression specifications, any cycle-fixed effects, and the criteria applied to assess consistency across the pre- and post-discussion panel outcomes.
  Revision: yes
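A regression with cycle-fixed effects, as promised in the rebuttal, can be sketched with the within-cycle estimator: subtract each cycle's mean from both the rank and the covariate, then fit one slope to the demeaned values. This is a minimal sketch under stated assumptions, not the authors' specification. The function name `within_cycle_slope`, the single-covariate setup, and the input layout are hypothetical.

```python
from collections import defaultdict


def within_cycle_slope(rows) -> float:
    """OLS slope of rank on a covariate after removing cycle means.

    rows: iterable of (cycle, covariate, rank) tuples.
    Demeaning within each cycle gives the same slope as including one
    dummy variable per cycle (cycle-fixed effects).
    """
    by_cycle = defaultdict(list)
    for cycle, x, y in rows:
        by_cycle[cycle].append((x, y))

    sxy = 0.0  # sum of demeaned covariate times demeaned rank
    sxx = 0.0  # sum of squared demeaned covariate
    for observations in by_cycle.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical rows: rank = 2 * covariate + a cycle-specific offset.
    toy_rows = [(1, 0, 10), (1, 1, 12), (2, 0, 50), (2, 1, 52), (2, 2, 54)]
    print(within_cycle_slope(toy_rows))
```

Estimating this slope separately for panel cycles and DPR cycles, and then comparing the two values with their uncertainties, would be one way to test whether a ranking trend differs between review formats. A full analysis would also need standard errors and additional covariates.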
Circularity Check
No circularity; purely empirical data comparison
The paper performs a direct empirical comparison of ranking trends in >20,000 proposals across 13 ALMA cycles before and after the DPR transition. No derivations, equations, fitted parameters, or predictions appear in the abstract or described methods. All claims rest on observed statistics (rank correlations, dispersion, quality ratings) rather than any self-referential construction, self-citation load-bearing step, or ansatz. The central result is therefore independent of its own inputs and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The pre- and post-DPR transition proposal pools and reviewer populations are sufficiently comparable that observed similarities in ranking trends can be attributed to the review process rather than to concurrent changes in science areas, demographics, or proposal volume.