RooAgent: An LLM Agent for Root-Based High Energy Physics Analysis
Pith reviewed 2026-05-20 13:27 UTC · model grok-4.3
The pith
RooAgent lets an LLM agent invoke PyROOT functions to run high energy physics analyses from plain-language prompts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RooAgent exposes PyROOT physics analysis functions as tools to an LLM agent that responds to plain-language prompts. It supports LangGraph and Model Context Protocol modes while keeping the analysis logic in PyROOT. The package is illustrated with Monte Carlo simulations of pp→ZH, a multi-task signal-background workflow, a toy statistical analysis, and an application to ATLAS open data for H→ZZ*→4ℓ.
What carries the argument
LLM agent that selects and supplies arguments to PyROOT analysis functions provided as tools
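The tool-calling pattern described here can be sketched in a few lines. This is a minimal illustration, not RooAgent's actual code: the registry, the `count_events` tool, its arguments, and the shape of the tool-call dictionary are all assumptions made for the example, and a plain Python function stands in for a PyROOT-backed one so the sketch runs without ROOT installed.

```python
import inspect
from typing import Any, Callable, Dict

# Hypothetical registry; RooAgent's real tool names and signatures may differ.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a plain Python function as an agent-callable tool."""
    TOOLS[func.__name__] = func
    return func


@tool
def count_events(pt_values: list, pt_min: float) -> int:
    """Stand-in for a PyROOT-backed selection: count entries with pT above a cut."""
    return sum(1 for pt in pt_values if pt > pt_min)


def dispatch(call: dict) -> Any:
    """Validate an LLM-proposed tool call against the signature, then run it."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    func = TOOLS[name]
    # bind() raises TypeError when arguments are missing or unexpected,
    # so a malformed call fails before any analysis code runs.
    bound = inspect.signature(func).bind(**call["arguments"])
    return func(*bound.args, **bound.kwargs)


# A tool call as an LLM might emit it (assumed format, made-up values).
call = {
    "name": "count_events",
    "arguments": {"pt_values": [12.0, 35.5, 48.2, 20.1], "pt_min": 25.0},
}
result = dispatch(call)
print(result)
```

Checking the arguments before execution matters because the model, not the user, fills them in; a call that does not match the signature is rejected instead of silently producing a wrong number.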
If this is right
- Users can request tasks such as histogram inspection or kinematic visualization directly in natural language.
- The system supports full workflows including event selection, fitting, and significance estimation on Monte Carlo and open data.
- Analysis logic remains in PyROOT while different LLM backends are interchangeable through the two supported modes.
- The package is demonstrated on standard ATLAS open data for the four-lepton Higgs channel.
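For the significance-estimation step mentioned in the list above, the page does not say which estimator RooAgent uses. As a hedged sketch, two standard choices for a counting experiment with expected signal s and background b are the simple ratio s/√b and the Asimov approximation Z = √(2[(s+b) ln(1+s/b) − s]). The function names and the example yields below are illustrative assumptions.

```python
import math


def simple_significance(s: float, b: float) -> float:
    """Naive estimate s / sqrt(b); only reasonable when s is much smaller than b."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    return s / math.sqrt(b)


def asimov_significance(s: float, b: float) -> float:
    """Asimov median discovery significance for a single-bin counting experiment."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


# Made-up yields, chosen only to show how the two estimates relate.
s, b = 10.0, 100.0
z_simple = simple_significance(s, b)
z_asimov = asimov_significance(s, b)
print(f"s/sqrt(b) = {z_simple:.3f}, Asimov Z = {z_asimov:.3f}")
```

The two values converge when s is small relative to b and diverge as the signal fraction grows, which is one reason an agent-driven workflow should report which estimator it applied.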
Where Pith is reading between the lines
- If error rates remain low, the approach could shorten the time between idea and result for routine HEP tasks.
- Educational use might let students explore open data by describing questions rather than learning PyROOT syntax first.
- Extending the tool set to include more advanced statistical methods could test whether the same prompting style scales to full analyses.
Load-bearing premise
That an LLM can consistently select the correct analysis tools and supply accurate arguments for non-trivial physics workflows without introducing logical or numerical errors.
What would settle it
Execute a known workflow such as event selection and significance estimation on a ZH Monte Carlo sample both via the agent and via direct PyROOT code, then compare the resulting histograms, fit parameters, and significance values for agreement.
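A comparison of this kind could be automated along the following lines. This is a sketch under stated assumptions: the result dictionaries, their keys, and all numbers are hypothetical placeholders, and in practice the values would be read from the histograms and fit results produced by the two runs.

```python
import math


def compare_results(agent: dict, direct: dict, rel_tol: float = 1e-9) -> list:
    """Return the names of quantities that disagree between the two runs."""
    mismatches = []
    for key in sorted(set(agent) | set(direct)):
        if key not in agent or key not in direct:
            mismatches.append(key)  # quantity missing from one of the runs
            continue
        a, d = agent[key], direct[key]
        # Treat scalars as one-element sequences so histograms (lists of
        # bin contents) and single numbers share the same comparison.
        a_vals = a if isinstance(a, (list, tuple)) else [a]
        d_vals = d if isinstance(d, (list, tuple)) else [d]
        same = len(a_vals) == len(d_vals) and all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
            for x, y in zip(a_vals, d_vals)
        )
        if not same:
            mismatches.append(key)
    return mismatches


# Hypothetical outputs: the direct PyROOT run is taken as the reference.
direct_run = {"mll_hist": [3.0, 12.0, 40.0, 11.0], "fit_mean": 91.02, "significance": 2.31}
agent_run = {"mll_hist": [3.0, 12.0, 40.0, 11.0], "fit_mean": 91.02, "significance": 2.29}

mismatches = compare_results(agent_run, direct_run)
print(mismatches)
```

A tight tolerance is appropriate here because both paths are meant to call the same PyROOT code on the same input; any difference points to a different tool choice or argument rather than to numerical noise.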
Original abstract
We present RooAgent as a natural-language interface for Root-based high energy physics data analysis. The package provides physics analysis functions as tools that an LLM agent invokes in response to plain-language prompts. Two operating modes are supported: a LangGraph-based agent compatible with OpenAI's GPT-4.1 via GitHub Copilot and with DeepSeek-V3 via Ollama, and a Model Context Protocol server for use with the Anthropic Claude CLI (Sonnet 4.6). In both modes the analysis logic is implemented in PyRoot and the LLM selects tools and supplies the required arguments. The package supports histogram inspection, event selection, visualisation of kinematic distributions, fitting, and significance estimation, among other tasks. We illustrate RooAgent with tests based on Monte Carlo simulations of $pp\to ZH$ ($Z\to\ell^+\ell^-$, $H\to b\bar{b}$), a multi-task signal-background workflow, a toy statistical analysis, and an application to ATLAS open data for $H\to ZZ^*\to 4\ell$. The package is available on PyPI and the source code is hosted at https://github.com/amanmdesai/RooAgent.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents RooAgent, a natural-language interface for ROOT-based high-energy physics data analysis. Physics analysis functions are exposed as tools that an LLM agent invokes in response to plain-language prompts. Two operating modes are supported (LangGraph-based agent with GPT-4.1 or DeepSeek-V3, and Model Context Protocol server with Claude). The package implements tasks including histogram inspection, event selection, kinematic visualization, fitting, and significance estimation in PyROOT. The authors illustrate the tool with single-run examples on pp→ZH Monte Carlo, a multi-task signal-background workflow, toy statistics, and ATLAS open data for H→ZZ*→4ℓ.
Significance. If the agent reliably selects and parameterizes tools for non-trivial multi-step analyses, RooAgent could lower the barrier to ROOT usage in HEP and accelerate prototyping for both experts and newcomers. The open-source release on PyPI and GitHub is a clear practical strength. However, the absence of quantitative evaluation metrics makes it difficult to judge whether the claimed reliability holds for realistic workflows.
major comments (1)
- [Results / Illustrations] The central claim that the LLM agent produces correct tool calls and arguments for multi-step physics workflows (event selection, fitting, significance) rests on illustrative single-run examples only. No success rates, error distributions, repeated-trial statistics, or failure-case analysis are reported for the Monte Carlo ZH, multi-task, toy-statistics, or ATLAS open-data demonstrations. This omission directly limits assessment of whether the approach is practically reliable.
minor comments (2)
- A concise table listing all available tools, their required arguments, and example natural-language prompts would improve clarity and usability.
- [Abstract] The abstract and introduction could more explicitly state the scope limitations (e.g., that the current implementation targets common but not exhaustive ROOT tasks).
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and for acknowledging the practical strengths of RooAgent, including its open-source availability. We address the single major comment below.
Point-by-point responses
Referee: [Results / Illustrations] The central claim that the LLM agent produces correct tool calls and arguments for multi-step physics workflows (event selection, fitting, significance) rests on illustrative single-run examples only. No success rates, error distributions, repeated-trial statistics, or failure-case analysis are reported for the Monte Carlo ZH, multi-task, toy-statistics, or ATLAS open-data demonstrations. This omission directly limits assessment of whether the approach is practically reliable.
Authors: We agree that the current presentation relies on single-run illustrative examples and lacks the quantitative metrics (success rates, error distributions, repeated-trial statistics) that would allow a more rigorous evaluation of reliability for non-trivial workflows. The manuscript frames these demonstrations as illustrations of functionality rather than as a statistical benchmark study. To address this limitation, we will revise the manuscript by adding a new subsection that reports observed success rates and common failure modes from repeated executions of the multi-task and ATLAS open-data workflows. This will include a concise table of results and a discussion of typical issues such as argument mis-specification in complex selections.
Revision: yes
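The repeated-trial statistics discussed in this exchange could be summarised as a success rate with a binomial confidence interval. The sketch below uses the Wilson score interval, which is one common choice and not something the manuscript or the rebuttal specifies. The outcome list is a made-up placeholder, not a measured result.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    return centre - half, centre + half


# Placeholder outcomes: True means the agent reproduced the reference result
# for one repetition of a workflow. These values are purely illustrative.
outcomes = [True] * 18 + [False] * 2

n_success = sum(outcomes)
n_trials = len(outcomes)
rate = n_success / n_trials
low, high = wilson_interval(n_success, n_trials)
print(f"success rate {rate:.2f}, interval [{low:.2f}, {high:.2f}]")
```

The Wilson interval behaves sensibly at small trial counts and near rates of 0 or 1, where the simpler normal approximation can extend outside the [0, 1] range.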
Circularity Check
No significant circularity: software tool description
The manuscript is a software engineering contribution that describes the RooAgent package, its two operating modes, supported physics analysis functions implemented in PyROOT, and illustrative examples on ZH Monte Carlo, multi-task workflows, toy statistics, and ATLAS open data. No equations, derivations, fitted parameters, predictions of new quantities, or load-bearing self-citations appear in the provided text. The central claims reduce only to the existence and functionality of the released code on PyPI and GitHub, which is externally verifiable and independent of any internal reduction. This is the expected finding for a non-mathematical tool paper.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: We present RooAgent as a natural-language interface for Root-based high energy physics data analysis. The package provides physics analysis functions as tools that an LLM agent invokes in response to plain-language prompts.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: The tool set aims to cover common tasks in Root-based analyses: file and tree inspection, histogram filling and visualisation, event counting and cutflow generation, significance calculation, parametric fitting...
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.