DarkAgents
Pith reviewed 2026-06-27 12:32 UTC · model grok-4.3
The pith
DarkAgents deploys multi-agent LLMs with deterministic code to build audited pipelines from scale-invariant models to NANOGrav gravitational-wave fits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DarkAgent-PT applies the multi-agent system to a classically scale-invariant particle-physics model, derives best-fit values for the parameters that reproduce the NANOGrav spectrum via a first-order phase transition, compiles the relevant experimental and observational bounds on those parameters, and supplies an explicit audit of all assumptions and priors that entered the calculation and the constraint list. The same runs expose inconsistencies in some previously published fits and generate new ones that employ the dissipative bulk-flow gravitational-wave template.
What carries the argument
The multi-agent orchestration layer that combines LLM reasoning and code generation with deterministic, tested human-written code to assemble and execute the full pipeline from model definition through phase-transition dynamics, gravitational-wave prediction, parameter fitting, constraint lookup, and assumption auditing.
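To make the orchestration idea concrete, here is a minimal sketch of a staged pipeline that records, for every step, whether its code was LLM-generated or human-written. It is not the DarkAgents implementation: `Stage`, `RunResult`, `run_pipeline` and the three toy stage functions are hypothetical names, and the numbers are placeholders with no physical meaning.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Stage:
    """One pipeline step (hypothetical structure, for illustration only)."""
    name: str
    run: Callable[[Dict[str, Any]], Dict[str, Any]]
    llm_generated: bool  # True if the step's code came from an LLM agent


@dataclass
class RunResult:
    state: Dict[str, Any]
    audit: List[Dict[str, Any]] = field(default_factory=list)


def run_pipeline(stages: List[Stage], initial: Dict[str, Any]) -> RunResult:
    """Run the stages in order and log which keys each stage produced."""
    result = RunResult(state=dict(initial))
    for stage in stages:
        produced = stage.run(result.state)
        result.state.update(produced)
        result.audit.append({
            "stage": stage.name,
            "llm_generated": stage.llm_generated,
            "outputs": sorted(produced),
        })
    return result


# Toy placeholder stages: the formulas are made up and carry no physics.
def define_model(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"coupling": 0.5}


def phase_transition(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"strength": 2.0 * state["coupling"]}


def gw_prediction(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"peak_amplitude": state["strength"] ** 2}


stages = [
    Stage("model_definition", define_model, llm_generated=True),
    Stage("phase_transition", phase_transition, llm_generated=False),
    Stage("gw_prediction", gw_prediction, llm_generated=False),
]

result = run_pipeline(stages, {})
for entry in result.audit:
    print(entry)
```

Keeping the `llm_generated` flag next to each stage's outputs is one way a later audit could separate deterministic steps from those that need extra scrutiny.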
If this is right
- The framework can flag inconsistencies between new and existing literature fits for the same class of models.
- It can generate new fits that incorporate alternative gravitational-wave templates such as the dissipative bulk-flow spectrum.
- Every run produces a machine-readable audit report that makes the choice of priors and approximations explicit for later scrutiny.
- The same architecture supports different underlying language models, including local deployments, without changing the pipeline logic.
- Public release of the code allows direct reuse and extension to other astroparticle calculations that require similar multi-step modeling.
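As an illustration of what a machine-readable audit report could look like, the sketch below serializes priors and assumptions to JSON. The schema is an assumption made for this example, not the format DarkAgents actually emits; `PriorRecord`, `AuditReport`, the parameter name `g` and the assumption text are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class PriorRecord:
    """One prior entering the fit (hypothetical schema)."""
    parameter: str
    kind: str      # e.g. "log-uniform"
    lower: float
    upper: float


@dataclass
class AuditReport:
    """Container for everything a reader needs to scrutinize a fit."""
    model: str
    gw_template: str
    priors: List[PriorRecord]
    assumptions: List[str]

    def to_json(self) -> str:
        # asdict recurses into the nested PriorRecord entries.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


report = AuditReport(
    model="toy scale-invariant model",
    gw_template="dissipative bulk-flow",
    priors=[PriorRecord("g", "log-uniform", 1e-3, 1.0)],  # placeholder values
    assumptions=["placeholder: approximation X used in the transition dynamics"],
)

text = report.to_json()
restored = json.loads(text)  # round-trip to confirm the report is parseable
print(text)
```

A flat, sorted JSON layout like this makes two reports for the same model easy to diff, which is how differing priors between fits could be spotted.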
Where Pith is reading between the lines
- The audit-report feature could be adopted as a minimal reproducibility standard for any future parameter fit in the field.
- Once validated on additional benchmarks, the approach might shorten the time between proposing a new particle-physics model and obtaining its observational constraints.
- The same orchestration pattern could be tested on collider-phenomenology pipelines that likewise combine model building, event generation, and experimental-limit comparisons.
- Local-LLM variants would allow the entire workflow to run without external API calls, which may matter for computations involving proprietary or sensitive model details.
Load-bearing premise
The reasoning and code-generation steps performed by the language models remain reliable enough, when paired with deterministic human code, that they introduce neither undetected calculation errors nor invalid modeling assumptions into the final pipelines and fits.
What would settle it
Application of the system to a benchmark first-order transition model whose correct best-fit values and assumption list are already established in the literature; mismatch between the system output and the known correct values, or failure to flag a known flawed assumption, would falsify the reliability claim.
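Such a benchmark test could be scored along the lines of the sketch below, which checks best-fit values against reference values within a relative tolerance and lists known flawed assumptions that the system failed to flag. The function names, the parameter names `alpha` and `beta_over_H`, the assumption labels, the 5% tolerance and all numbers are illustrative assumptions, not values from the paper.

```python
import math
from typing import Dict, Iterable, List, Set


def compare_to_benchmark(
    output: Dict[str, float],
    reference: Dict[str, float],
    rel_tol: float = 0.05,
) -> List[str]:
    """Return the parameters that are missing or deviate beyond rel_tol."""
    mismatches = []
    for name, ref in reference.items():
        value = output.get(name)
        if value is None or not math.isclose(value, ref, rel_tol=rel_tol):
            mismatches.append(name)
    return mismatches


def missed_flags(flagged: Iterable[str], known_flawed: Iterable[str]) -> Set[str]:
    """Return known flawed assumptions that the system did not flag."""
    return set(known_flawed) - set(flagged)


# Made-up numbers, purely to exercise the two checks.
reference = {"alpha": 1.0, "beta_over_H": 10.0}
system_output = {"alpha": 1.02, "beta_over_H": 12.0}

mismatched = compare_to_benchmark(system_output, reference)
unflagged = missed_flags(
    flagged=["assumption_A"],
    known_flawed=["assumption_A", "assumption_B"],
)

print("parameters outside tolerance:", mismatched)
print("flawed assumptions not flagged:", sorted(unflagged))
```

Under this scoring, a non-empty result from either check would count against the reliability claim.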
Original abstract
We present DarkAgents: a multi-agent system that leverages the reasoning and code-generation capabilities of large language models (LLMs), together with deterministic, tested, human-written code, to build orchestrated pipelines for theoretical astroparticle physics research. While related approaches have been proposed in collider physics and cosmology, DarkAgents targets the specific challenges of this domain, such as model building, complex pipeline computations, multiple constraints and assumption auditing. The framework can be powered by different agentic command-line tools, including those from Mistral, Anthropic and OpenAI, as well as local LLMs via Ollama. As a first implementation, we apply DarkAgents to the study of cosmological first-order phase transitions, starting from a classically scale-invariant particle-physics model and ending with the fit to the NANOGrav nanohertz gravitational-wave spectrum. DarkAgent-PT provides as output i) the best-fit values of model parameters, ii) their existing experimental and observational constraints, iii) an audit report of the assumptions and priors entering both i) and ii), of particular relevance for astroparticle physics. Our test runs identify inconsistencies in some fits in the literature and produce novel ones based on the dissipative bulk-flow GW template. The code is publicly available at https://github.com/PhysicsZandi/DarkAgents.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DarkAgents, a multi-agent system combining LLM reasoning and code-generation capabilities with deterministic human-written code to construct pipelines for theoretical astroparticle physics. As a first application, DarkAgent-PT is used to study a classically scale-invariant particle-physics model, fit it to the NANOGrav nanohertz gravitational-wave spectrum, and output best-fit parameter values, existing experimental/observational constraints, and an audit report of assumptions and priors. The work claims that test runs identify inconsistencies in some literature fits and produce novel fits based on the dissipative bulk-flow GW template. The code is publicly available.
Significance. If the outputs can be independently verified as correct, the framework could help manage the complexity of model building, multi-constraint analyses, and assumption auditing in astroparticle physics. The public code release is a clear strength that enables reproducibility checks. At present, however, the significance is limited because the central claims rest on unverified LLM-generated pipelines.
major comments (3)
- [Abstract] Abstract: the claim that the test runs 'identify inconsistencies in some fits in the literature and produce novel ones' is presented without any specific examples, comparison tables, or quantitative differences from prior results, yet it is load-bearing for the asserted utility of the framework.
- [Abstract] Abstract and results description: no equations for the scale-invariant model, the GW template (including the dissipative bulk-flow case), the likelihood, or the fitting procedure are provided, nor is any validation against known analytic limits or manual calculations reported, leaving the correctness of the reported best-fit values and audits unassessable.
- [Abstract] Abstract: the manuscript does not describe any cross-checks, error propagation, or sensitivity tests confirming that LLM-orchestrated steps (model construction, prior specification, constraint application) introduce no undetected errors, which directly affects the reliability of the claimed outputs.
minor comments (2)
- The abstract lists support for multiple LLM back-ends (Mistral, Anthropic, OpenAI, Ollama) but provides no usage examples or performance notes for the NANOGrav application.
- No workflow diagram or pseudocode for the agent orchestration is provided; one would aid clarity even if the technical details are expanded elsewhere.
Simulated Author's Rebuttal
We thank the referee for their careful review and constructive feedback on the manuscript. We address each major comment below and indicate the revisions planned for the next version.
- Referee: [Abstract] Abstract: the claim that the test runs 'identify inconsistencies in some fits in the literature and produce novel ones' is presented without any specific examples, comparison tables, or quantitative differences from prior results, yet it is load-bearing for the asserted utility of the framework.
  Authors: We agree that the abstract would benefit from concrete examples to support the claim. In the revised manuscript we will add a brief description of one specific literature inconsistency (a cited fit whose priors were flagged as internally inconsistent by the audit module) together with the quantitative shifts in best-fit parameters obtained with the dissipative bulk-flow template. Full comparison tables already appear in the results section and will be referenced from the abstract.
  Revision: yes
- Referee: [Abstract] Abstract and results description: no equations for the scale-invariant model, the GW template (including the dissipative bulk-flow case), the likelihood, or the fitting procedure are provided, nor is any validation against known analytic limits or manual calculations reported, leaving the correctness of the reported best-fit values and audits unassessable.
  Authors: The referee is correct that the abstract and the summarized results description omit the explicit equations and validation steps. We will insert the key equations (scale-invariant potential, GW spectrum expressions for both templates, likelihood, and fitting procedure) into a new concise subsection of the results. We will also add explicit statements of the analytic-limit checks and manual cross-verifications that were performed on the pipeline outputs.
  Revision: yes
- Referee: [Abstract] Abstract: the manuscript does not describe any cross-checks, error propagation, or sensitivity tests confirming that LLM-orchestrated steps (model construction, prior specification, constraint application) introduce no undetected errors, which directly affects the reliability of the claimed outputs.
  Authors: We acknowledge that the current text does not detail cross-checks or sensitivity tests for the LLM-driven components. In the revision we will add a dedicated validation subsection describing (i) manual inspection of a random sample of LLM-generated code segments, (ii) sensitivity tests on prior choices, and (iii) comparison of LLM-orchestrated versus fully manual runs for a subset of the NANOGrav fits. These additions will directly address the reliability concern.
  Revision: yes
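A sensitivity test on prior choices, item (ii) in the last response, can be illustrated with a toy one-parameter fit. The sketch below is not the authors' procedure: it assumes a linear model y = A x, a flat prior on A (so the best fit is the chi-square minimum inside the prior range), a simple grid search, and made-up data. The names `chi2`, `best_fit` and `hits_prior_edge` are hypothetical.

```python
from typing import Dict, List, Tuple


def chi2(amplitude: float, xs: List[float], ys: List[float], sigma: float) -> float:
    """Chi-square of the toy model y = amplitude * x."""
    return sum(((y - amplitude * x) / sigma) ** 2 for x, y in zip(xs, ys))


def best_fit(
    xs: List[float],
    ys: List[float],
    sigma: float,
    prior: Tuple[float, float],
    n_grid: int = 2001,
) -> float:
    """Grid-search the chi-square minimum inside a flat prior range."""
    lo, hi = prior
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    return min(grid, key=lambda a: chi2(a, xs, ys, sigma))


def hits_prior_edge(value: float, prior: Tuple[float, float], tol: float = 1e-6) -> bool:
    """True if the best fit sits on a prior boundary, i.e. is prior-dominated."""
    lo, hi = prior
    return abs(value - lo) < tol or abs(value - hi) < tol


# Made-up noiseless data generated with amplitude 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

priors: Dict[str, Tuple[float, float]] = {"wide": (0.0, 10.0), "narrow": (0.0, 1.5)}
fits = {name: best_fit(xs, ys, 0.1, p) for name, p in priors.items()}

for name, value in fits.items():
    print(name, round(value, 4), "edge" if hits_prior_edge(value, priors[name]) else "interior")
```

In this toy case the narrow prior excludes the true amplitude, so the best fit lands on the upper boundary. A best fit that moves with the prior range in this way is the kind of dependence a sensitivity test is meant to expose.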
Circularity Check
Framework introduction exhibits no load-bearing circularity
The paper presents DarkAgents as a new multi-agent LLM-plus-human-code framework for building astroparticle physics pipelines, with its primary output being best-fit parameters, constraints, and assumption audits for a scale-invariant model fitted to NANOGrav data. No derivation chain reduces a claimed prediction or first-principles result to its own inputs by construction, self-definition, or self-citation. The central value resides in the orchestration tool itself rather than in any fitted quantity or uniqueness theorem that would require external verification. Minor self-citation risk is present but not load-bearing for the framework claim.
Forward citations
Cited by 1 Pith paper
- LeWRON: Agentic Analysis of Electroweak Phase Transitions
  LeWRON is a new agentic framework that automates construction, auditing, and exploration of finite-temperature effective potentials and gravitational-wave predictions for electroweak phase transitions starting from an...