Phenomenological Detector Design and Optimization in Vertically-Integrated Differentiable Full Simulations with Agentic-AI
Pith reviewed 2026-05-08 13:00 UTC · model grok-4.3
The pith
AI agents using current language models can optimize detector geometry, digitization, and reconstruction parameters together in a single differentiable simulation for high-energy physics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present the first implementation of AI agents for detector design in high-energy physics through a bilevel optimization framework. The framework vertically integrates detector geometry, front-end digitization, and high-level reconstruction algorithm parameters inside differentiable full simulations. On the concrete example of a dual-readout segmented crystal electromagnetic calorimeter, the agent simultaneously tunes crystal granularity and length, number of ADC bits, sampling rate, and center-of-gravity hit-clustering radius. The work finds that today’s LLM-based reasoning models, without added experiment-specific context, can execute these complex workflows and suggest generic but relevant avenues for further study or improvement.
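As a rough illustration of the last parameter in that list, a center-of-gravity position estimate is the energy-weighted mean of the hit positions that fall inside a clustering radius around a seed. The sketch below is a minimal Python version under assumed conventions: 2D hit positions, the highest-energy hit as the seed, and a hypothetical function name `cog_cluster`. It is not taken from the paper's code.

```python
import math

def cog_cluster(hits, radius):
    """Energy-weighted centroid of hits within `radius` of the seed hit.

    hits: list of (x, y, energy) tuples; the seed is the highest-energy hit
    (an assumption made for this illustration).
    Returns (x_cog, y_cog, total_energy).
    """
    seed_x, seed_y, _ = max(hits, key=lambda h: h[2])
    # Keep only hits whose distance to the seed is within the clustering radius.
    selected = [
        (x, y, e) for x, y, e in hits
        if math.hypot(x - seed_x, y - seed_y) <= radius
    ]
    total = sum(e for _, _, e in selected)
    x_cog = sum(x * e for x, _, e in selected) / total
    y_cog = sum(y * e for _, y, e in selected) / total
    return x_cog, y_cog, total

# Invented hits: three close to the seed, one far away.
hits = [(0.0, 0.0, 6.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0), (5.0, 5.0, 1.0)]
print(cog_cluster(hits, radius=1.5))
```

With these invented hits and a radius of 1.5, the distant hit at (5, 5) is excluded; a larger radius would pull the centroid toward it, which is the kind of trade-off a clustering-radius scan explores.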
What carries the argument
Bilevel optimization framework that vertically integrates detector geometry, front-end digitization, and reconstruction parameters inside a differentiable full simulation, with an LLM-based agent directing the outer loop.
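The division of labor described here can be sketched in a few lines of Python. In this toy version the outer level loops over a discrete choice (number of ADC bits) and the inner level runs gradient descent on a continuous parameter (clustering radius) against a made-up differentiable loss. The loss, its constants, all names, and the assignment of parameters to levels are hypothetical stand-ins: in the paper the outer loop is directed by an LLM-based agent and the lower level is a differentiable full simulation, neither of which is reproduced here.

```python
def toy_loss(adc_bits, radius):
    """Hypothetical smooth loss: quantization term plus a radius trade-off."""
    quantization = 2.0 ** (-adc_bits)   # fewer bits -> coarser digitization
    cost = 0.002 * adc_bits             # made-up penalty for more bits
    clustering = (radius - 1.5) ** 2    # made-up optimum at radius = 1.5
    return quantization + cost + clustering

def d_loss_d_radius(radius):
    """Analytic gradient of toy_loss with respect to radius."""
    return 2.0 * (radius - 1.5)

def inner_optimize(adc_bits, radius=0.5, lr=0.1, steps=200):
    """Lower level: gradient descent on the continuous parameter."""
    for _ in range(steps):
        radius -= lr * d_loss_d_radius(radius)
    return radius, toy_loss(adc_bits, radius)

def outer_search(candidate_bits):
    """Upper level: a plain loop standing in for the agent's proposals."""
    best = None
    for bits in candidate_bits:
        radius, loss = inner_optimize(bits)
        if best is None or loss < best[2]:
            best = (bits, radius, loss)
    return best

bits, radius, loss = outer_search([6, 8, 10, 12, 14])
print(bits, round(radius, 3), round(loss, 5))
```

The point of the structure is that every outer proposal is scored only after the inner parameters have been re-optimized for it, so discrete and continuous choices are judged jointly rather than in sequence.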
If this is right
- Simultaneous optimization of geometry, digitization, and reconstruction parameters becomes feasible in one run instead of sequential manual stages.
- Labor and compute required for exploring detector design spaces are reduced.
- Computational checks of first-principles design choices become routine rather than exceptional.
- Generic but relevant avenues for further study are identified automatically by the agent.
Where Pith is reading between the lines
- The same vertically integrated agent loop could be applied to other detector subsystems such as trackers or muon systems to test broader applicability.
- Adding modest experiment-specific context or fine-tuning the agent might allow it to make physics-motivated leaps that the current work explicitly does not claim.
- Repeated runs on varied calorimeter layouts would provide a direct test of how sensitive the observed optimizations are to the baseline design.
Load-bearing premise
Current LLM-based reasoning models can carry out complex multi-layer detector optimization workflows and suggest relevant improvements without being supplied additional experiment-specific context.
What would settle it
Running the same agent-driven workflow on the dual-readout calorimeter and finding that it fails to reduce any of the listed parameters or produces no relevant improvement suggestions would show the claimed effectiveness does not hold.
Original abstract
We present the first implementation of AI agents into the design and optimization of detectors in high-energy physics experiments via a bilevel optimization framework that vertically integrates detector geometry, front-end digitization, and high-level reconstruction algorithm parameters in differentiable full simulations. Using the example of a dual-readout, segmented crystal EM calorimeter with a baseline resolution of $3\%/\sqrt{E}$, we investigate the capabilities and value propositions of AI agents in the identification and reduction of key detector parameters and in the nonlinear traversal of a given detector design's full parameter space. We find that LLM-based reasoning models today, without being given additional experiment-specific context, are able to effectively execute complex workflows and proactively suggest generic but relevant avenues for further study or improvement. Here, we demonstrate an AI agent's ability to use the workflow to simultaneously optimize a representative subset of vertically integrated detector parameters: crystal granularity and length, number of ADC bits and sampling rate, and center-of-gravity hit-clustering radius. We find that effective integration of agents into the complex workflows of frontier areas of research not only significantly reduces labor and compute, but opens up efficient avenues for computational validation of first-principles design choices. While the ability to make autonomous leaps of physics-motivated judgment or insight is not demonstrated in this work, this study defines the current frontier of experimental design methods in high-energy physics.
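For scale, the quoted baseline can be turned into numbers. Reading $3\%/\sqrt{E}$ as the stochastic term of the relative energy resolution, with $E$ in GeV (the usual calorimetry convention, assumed here because the abstract states neither units nor any noise or constant terms), a short Python calculation gives the relative resolution at a few energies. The function name is hypothetical.

```python
import math

def relative_resolution(energy_gev, stochastic=0.03):
    """sigma_E / E for a pure stochastic term a / sqrt(E), E in GeV (assumed)."""
    return stochastic / math.sqrt(energy_gev)

for energy in (1.0, 10.0, 100.0):
    percent = 100 * relative_resolution(energy)
    print(f"E = {energy:6.1f} GeV  sigma/E = {percent:.2f}%")
```

Under this reading the relative resolution is 3% at 1 GeV and 0.3% at 100 GeV; any improvement the agent finds would have to be quoted against these values to be meaningful.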
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to present the first implementation of AI agents in a bilevel optimization framework for detector design in high-energy physics. It vertically integrates detector geometry, front-end digitization, and high-level reconstruction parameters within differentiable full simulations. Demonstrated on a dual-readout segmented crystal EM calorimeter with baseline resolution 3%/√E, LLM-based agents are shown to execute workflows optimizing crystal granularity and length, ADC bits, sampling rate, and clustering radius, with the agents able to proactively suggest improvements without experiment-specific context; the work concludes that this reduces labor/compute and enables computational validation of design choices, while noting that autonomous physics insight is not demonstrated.
Significance. If the central claims are supported by quantitative evidence, the approach could meaningfully advance phenomenological detector optimization in HEP by enabling efficient traversal of high-dimensional parameter spaces that combine geometry, electronics, and reconstruction. The vertical integration via differentiable simulations and agentic workflows is a novel methodological contribution that could lower the cost of exploring design trade-offs. The absence of benchmarks, however, currently limits the assessed significance to a proof-of-concept demonstration rather than a validated new capability.
major comments (3)
- [Abstract] Abstract and results demonstration: the claims that agents 'effectively execute complex workflows' and 'significantly reduces labor and compute' rest on a single qualitative run with no reported quantitative metrics (task success rate, number of iterations, final resolution vs. the stated 3%/√E baseline, or comparison to manual/traditional optimization baselines).
- [Methodology] Bilevel optimization description: the framework is stated to operate on externally defined simulation parameters, but no explicit equations, pseudocode, or convergence criteria are supplied showing how the upper-level agent decisions interact with the lower-level differentiable simulation or how gradients are propagated across the full chain (geometry to digitization to reconstruction).
- [Results] Parameter optimization results: the simultaneous optimization of granularity, length, ADC bits, sampling rate, and clustering radius is presented without error bars, statistical validation, or sensitivity analysis, making it impossible to assess whether the reported improvements are robust or merely anecdotal.
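For reference, a generic bilevel problem of the kind the second comment asks the authors to write down has the textbook form below. The symbols are assumptions introduced here, not the manuscript's notation: $\theta$ collects the upper-level design choices, $\phi$ the lower-level parameters tuned inside the differentiable simulation, and $F$ and $f$ are the upper- and lower-level objectives.

```latex
\[
\begin{aligned}
\min_{\theta \in \Theta}\quad & F\bigl(\theta, \phi^{*}(\theta)\bigr) \\
\text{subject to}\quad & \phi^{*}(\theta) \in \arg\min_{\phi \in \Phi} f(\theta, \phi)
\end{aligned}
\]
```

Stating the problem this way would make explicit which parameters the agent chooses directly and which are obtained by gradient-based minimization for each of those choices.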
minor comments (3)
- [Abstract] The title and abstract use 'Agentic-AI' without a concise definition or reference; a brief clarification in the introduction would aid readers unfamiliar with the term.
- [Results] Specific examples of the agent's ability to 'proactively suggest generic but relevant avenues' would strengthen the narrative; quoting or tabulating one or two agent-generated suggestions would make the claim more concrete.
- [Introduction] The manuscript would benefit from a short related-work paragraph situating the bilevel agent approach against prior ML-assisted detector optimization studies.
Simulated Author's Rebuttal
We thank the referee for the thorough and constructive review. The comments highlight important areas where the manuscript can be strengthened to better support its claims as a proof-of-concept. We address each major comment below and commit to revisions that add clarity, formalism, and quantitative support without overstating the current demonstration.
Point-by-point responses
- Referee: [Abstract] Abstract and results demonstration: the claims that agents 'effectively execute complex workflows' and 'significantly reduces labor and compute' rest on a single qualitative run with no reported quantitative metrics (task success rate, number of iterations, final resolution vs. the stated 3%/√E baseline, or comparison to manual/traditional optimization baselines).
  Authors: We agree that the abstract and results section would benefit from quantitative backing. In the revised manuscript we will report the number of agent iterations, a simple success metric for workflow completion, the final achieved resolution relative to the 3%/√E baseline, and a qualitative but explicit comparison of labor and compute effort versus a manual optimization workflow. These additions will be placed in both the abstract and a new subsection of the results.
  Revision: yes
- Referee: [Methodology] Bilevel optimization description: the framework is stated to operate on externally defined simulation parameters, but no explicit equations, pseudocode, or convergence criteria are supplied showing how the upper-level agent decisions interact with the lower-level differentiable simulation or how gradients are propagated across the full chain (geometry to digitization to reconstruction).
  Authors: The referee is correct that the current text lacks formal specification. We will add a dedicated subsection containing (i) the bilevel optimization objective in mathematical form, (ii) pseudocode for the agent–simulation loop, and (iii) a description of gradient flow through the vertically integrated chain. Convergence criteria used in the demonstration will also be stated explicitly.
  Revision: yes
- Referee: [Results] Parameter optimization results: the simultaneous optimization of granularity, length, ADC bits, sampling rate, and clustering radius is presented without error bars, statistical validation, or sensitivity analysis, making it impossible to assess whether the reported improvements are robust or merely anecdotal.
  Authors: We acknowledge that the presented results are from a single illustrative run. The revised results section will include error bars derived from repeated simulations where computationally feasible, a brief sensitivity study on the most influential parameters, and an explicit statement that the work is intended as a proof-of-concept rather than a statistically exhaustive benchmark. Full statistical validation across many random seeds will be noted as future work.
  Revision: partial
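Error bars of the kind promised in the last response could be obtained along these lines: repeat the run over several random seeds and report the mean with its standard error. The Python sketch below uses a hypothetical `run_once` stand-in that returns a noisy number; it does not call the paper's simulation, and the center value and spread are invented.

```python
import random
import statistics

def run_once(seed):
    """Hypothetical stand-in for one optimization run; returns a final metric."""
    rng = random.Random(seed)
    return 0.03 + rng.gauss(0.0, 0.002)   # invented center value and spread

def summarize(seeds):
    """Mean and standard error of the mean over repeated runs."""
    values = [run_once(seed) for seed in seeds]
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem

mean, sem = summarize(range(20))
print(f"metric = {mean:.4f} +/- {sem:.4f}")
```

Fixing the seed list makes the summary reproducible, and the same loop extends to a sensitivity study by varying one input parameter per batch of seeds.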
Circularity Check
No circularity: framework uses external parameters and qualitative demonstration
full rationale
The paper presents a bilevel optimization framework that vertically integrates detector geometry, digitization, and reconstruction parameters inside differentiable simulations, using an example dual-readout calorimeter with a stated baseline resolution of 3%/√E. All optimized quantities (granularity, length, ADC bits, sampling rate, clustering radius) are externally defined inputs to the simulation rather than quantities derived or fitted inside the same equations. No predictions are renamed as outputs of a fit, no uniqueness theorems are imported via self-citation, and no ansatz is smuggled through prior work. The demonstration is described as a qualitative exploration of agent workflows; the central claim therefore remains self-contained against external benchmarks and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Differentiable full simulations can accurately integrate detector geometry, front-end digitization, and high-level reconstruction algorithm parameters.