Experiment-as-Code Labs: A Declarative Stack for AI-Driven Scientific Discovery
Pith reviewed 2026-05-21 00:07 UTC · model grok-4.3
The pith
Experiments are encoded as declarative configurations that AI agents generate and systems compile to lab device APIs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a new paradigm called Experiment-as-Code (EaC) Labs, where a core concept is to encode experiments as declarative configurations that can be compiled down to device-level APIs. AI agents come up with hypotheses and experiments, written as an ensemble of declarative configurations. The systems layer performs program analysis, safety checks, resource assignment, and job orchestration. Finally, programmatic experimentation occurs via actuating the device APIs. This is a general stack that is science-, lab-, and instrument-independent.
What carries the argument
The Experiment-as-Code (EaC) declarative configuration stack, which encodes experiments so that AI-generated plans can be analyzed, safety-checked, and compiled to device APIs.
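The paper gives no concrete configuration format, so the following is only a minimal sketch of what "declarative configuration compiled to device-level APIs" could look like. Every name here (`Step`, `DeviceCall`, `compile_experiment`, the endpoint layout, and the parameter keys) is a hypothetical illustration, not the authors' schema.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One declarative experiment step (hypothetical schema)."""
    device: str   # logical device kind, e.g. "pipette"
    action: str   # declarative verb, e.g. "dispense"
    params: dict  # action parameters


@dataclass
class DeviceCall:
    """A device-level API call produced by compilation (hypothetical)."""
    endpoint: str
    payload: dict


def compile_experiment(steps):
    """Lower each declarative step to one device-level call, preserving order."""
    return [
        DeviceCall(endpoint=f"/{s.device}/{s.action}", payload=dict(s.params))
        for s in steps
    ]


# The experiment states what should happen, not how each instrument is driven.
experiment = [
    Step("pipette", "dispense", {"volume_ul": 50, "well": "A1"}),
    Step("heater", "set_temperature", {"celsius": 37}),
]
calls = compile_experiment(experiment)
for call in calls:
    print(call.endpoint, call.payload)
```

Keeping the experiment as plain data is what would let a systems layer inspect it before anything is actuated.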
If this is right
- AI agents can directly propose and execute experiments in physical labs rather than only in simulation.
- Safety verification and resource scheduling occur automatically before any device actuation.
- The same configuration format works across different scientific domains and instrument types.
- Experiment design shifts from writing custom control code to specifying declarative ensembles.
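The second consequence above, checking safety and assigning resources before any actuation, can be sketched as a planning pass over the declarative steps. This is an assumption-laden illustration: `SAFETY_LIMITS`, the inventory layout, and the function names are invented for the example and do not come from the paper.

```python
# Hypothetical per-device parameter limits: {device: {param: (low, high)}}.
SAFETY_LIMITS = {
    "heater": {"celsius": (4, 95)},
    "pipette": {"volume_ul": (1, 1000)},
}


def check_safety(steps):
    """Return a list of human-readable violations; empty means the plan passes."""
    violations = []
    for i, step in enumerate(steps):
        limits = SAFETY_LIMITS.get(step["device"], {})
        for key, (low, high) in limits.items():
            value = step["params"].get(key)
            if value is not None and not (low <= value <= high):
                violations.append(
                    f"step {i}: {step['device']}.{key}={value} outside [{low}, {high}]"
                )
    return violations


def assign_resources(steps, inventory):
    """Bind each logical device kind to the first listed physical instrument."""
    assignment = {}
    for step in steps:
        kind = step["device"]
        if kind not in assignment:
            if not inventory.get(kind):
                raise LookupError(f"no instrument available for {kind}")
            assignment[kind] = inventory[kind][0]
    return assignment


def plan(steps, inventory):
    """Run safety checks first; only a safe plan gets resources assigned."""
    violations = check_safety(steps)
    if violations:
        raise ValueError("; ".join(violations))
    return assign_resources(steps, inventory)


steps = [
    {"device": "pipette", "action": "dispense", "params": {"volume_ul": 50}},
    {"device": "heater", "action": "set_temperature", "params": {"celsius": 37}},
]
inventory = {"pipette": ["pipette-01"], "heater": ["heater-02", "heater-03"]}
assignment = plan(steps, inventory)
print(assignment)
```

Because `plan` raises before any assignment happens, an unsafe configuration never reaches the point where a device could be actuated.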
Where Pith is reading between the lines
- AI could adjust ongoing physical experiments in response to live sensor data by modifying the active declarative configuration.
- Standard declarative experiment descriptions might allow easier sharing and reuse of protocols between independent labs.
- The approach could lower the barrier for non-experts to run complex protocols by letting the AI and stack handle the details.
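The first inference above, adjusting a running experiment from live sensor data, is not specified in the paper. One way it might work is a loop in which an agent may replace the remaining steps after each reading, and every step is safety-checked immediately before it runs. The names `run_with_feedback`, `read_sensor`, `propose_update`, the `MAX_CELSIUS` limit, and the turbidity rule are all hypothetical.

```python
MAX_CELSIUS = 95  # hypothetical safety limit


def is_safe(step):
    """Safety check applied to every step right before execution."""
    return step["celsius"] <= MAX_CELSIUS


def run_with_feedback(steps, read_sensor, propose_update, max_updates=3):
    """Execute steps in order; after each one, the agent may replace the rest.

    Updated steps pass through the same safety check as the original ones,
    and max_updates bounds how often the plan can be rewritten.
    """
    pending = list(steps)
    log = []
    updates_left = max_updates
    while pending:
        step = pending.pop(0)
        if not is_safe(step):
            log.append(("rejected", step["name"]))
            continue
        log.append(("executed", step["name"]))
        reading = read_sensor(step)
        update = propose_update(step, reading) if updates_left > 0 else None
        if update is not None:
            pending = list(update)
            updates_left -= 1
    return log


def read_sensor(step):
    # Simulated sensor: heating produces an unexpectedly turbid sample.
    return {"turbidity": 0.9 if step["name"] == "heat" else 0.1}


def propose_update(step, reading):
    # Stand-in for the AI agent: on high turbidity, swap in a new plan.
    if reading["turbidity"] > 0.5:
        return [
            {"name": "cool", "celsius": 20},
            {"name": "overheat", "celsius": 120},  # unsafe on purpose
        ]
    return None


initial = [{"name": "heat", "celsius": 60}, {"name": "hold", "celsius": 60}]
log = run_with_feedback(initial, read_sensor, propose_update)
print(log)
# [('executed', 'heat'), ('executed', 'cool'), ('rejected', 'overheat')]
```

In this run the original "hold" step is dropped when the agent rewrites the plan, and the unsafe "overheat" step from the update is refused by the same check that guards the initial configuration.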
Load-bearing premise
Complex real-world experiments can be fully captured and executed safely by AI-generated declarative configurations without heavy domain-specific tuning or loss of needed flexibility.
What would settle it
A test in which an AI agent outputs a declarative configuration for a multi-step lab protocol, the system compiles and runs it on actual equipment, safety checks pass, and the physical results align with the intended hypothesis.
Original abstract
To unleash the full potential of AI for Science, we must untether the agents from a purely digital environment. The agent's ability to control and explore in real-world labs is essential because the physical lab remains foundational to scientific discovery. While some tasks can be performed on a computer (e.g., data analysis, running simulated experiments), Eureka moments could occur at any time while operating lab instruments (e.g., when a scientist notices unexpected clues, intuition may prompt a real-time course change). Although autonomous labs are on the rise, which expose programmable APIs to control scientific instruments via software, bridging the gap between increasingly powerful AI agents and automated lab equipment requires innovation that draws insights from computer systems. We propose a new paradigm called "Experiment-as-Code (EaC) Labs," where a core concept is to encode experiments as declarative configurations that can be compiled down to device-level APIs. AI agents come up with hypotheses and experiments, written as an ensemble of declarative configurations. The systems layer performs program analysis, safety checks, resource assignment, and job orchestration. Finally, programmatic experimentation occurs via actuating the device APIs. This is a general stack that is science-, lab-, and instrument-independent, representing a novel synthesis across the physical, systems, and intelligence layers to unleash the next breakthrough in AI for Science.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a new paradigm called 'Experiment-as-Code (EaC) Labs' to bridge AI agents with physical laboratory equipment. Experiments are encoded as declarative configurations that AI agents generate as ensembles; these are compiled to device-level APIs. A systems layer performs program analysis, safety checks, resource assignment, and job orchestration before actuation occurs. The stack is presented as general and independent of specific science domains, labs, or instruments, synthesizing physical, systems, and intelligence layers to enable more autonomous discovery.
Significance. If the declarative stack can be realized with the claimed flexibility, it would offer a valuable architectural synthesis for AI-for-Science systems, potentially allowing agents to control real-world experiments while incorporating safety and orchestration. The high-level proposal correctly identifies the gap between digital AI agents and programmable lab APIs as a systems problem.
major comments (2)
- [Abstract] Abstract: The motivation explicitly cites the need to support real-time course changes (e.g., 'Eureka moments' and 'intuition may prompt a real-time course change' during instrument operation), yet the declarative-configuration model is described only as static ensembles compiled to APIs; no mechanisms are given for runtime reconfiguration, live sensor-driven branching, or AI-driven mid-execution updates.
- [Abstract] Abstract: The central claim that the stack is 'science-, lab-, and instrument-independent' is load-bearing for the proposal's generality, but the description provides no concrete account of how declarative encodings or the systems layer accommodate heterogeneous device APIs and domain-specific constraints without per-instrument or per-domain extensions.
minor comments (1)
- The manuscript is entirely conceptual and contains no implementation sketch, pseudocode, or worked example of a declarative configuration; adding at least one such illustration would substantially improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive report and positive assessment of the proposal's potential. We address each major comment point-by-point below, indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract: The motivation explicitly cites the need to support real-time course changes (e.g., 'Eureka moments' and 'intuition may prompt a real-time course change' during instrument operation), yet the declarative-configuration model is described only as static ensembles compiled to APIs; no mechanisms are given for runtime reconfiguration, live sensor-driven branching, or AI-driven mid-execution updates.
  Authors: We agree that the abstract's motivation emphasizes real-time adaptability during physical experimentation, while the core description focuses on declarative ensemble generation, compilation, and static orchestration. The systems layer is intended to support dynamic elements through ongoing program analysis and safety re-evaluation, but explicit mechanisms for runtime reconfiguration are not detailed in the current version. In revision, we will expand the systems layer section to outline sensor-driven branching via live feedback loops and AI-agent-triggered mid-execution updates, with re-application of safety checks and resource re-assignment.
  Revision: yes
- Referee: [Abstract] Abstract: The central claim that the stack is 'science-, lab-, and instrument-independent' is load-bearing for the proposal's generality, but the description provides no concrete account of how declarative encodings or the systems layer accommodate heterogeneous device APIs and domain-specific constraints without per-instrument or per-domain extensions.
  Authors: The referee correctly notes that the independence claim requires a more explicit technical basis. The manuscript positions declarative configurations as an abstraction that the systems layer compiles to device APIs using program analysis for safety and resource assignment. However, concrete details on handling heterogeneity without extensions are limited. We will revise by adding a description of a unified declarative schema with modular mapping rules and abstract resource models in the systems layer, enabling accommodation of diverse APIs and constraints at the orchestration level rather than through per-domain core changes.
  Revision: yes
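The rebuttal's "unified declarative schema with modular mapping rules" is promised but not shown. A minimal sketch of the idea, assuming a registry keyed by vendor and action, might look like the following. The vendors, command formats, and the `mapping_rule` decorator are invented for illustration and do not describe any real instrument API.

```python
# Registry of mapping rules: (vendor, action) -> function producing a
# vendor-specific command. All vendors and command shapes are hypothetical.
MAPPING_RULES = {}


def mapping_rule(vendor, action):
    """Decorator that registers a lowering rule without touching the core."""
    def register(fn):
        MAPPING_RULES[(vendor, action)] = fn
        return fn
    return register


@mapping_rule("vendor_a", "dispense")
def _vendor_a_dispense(params):
    # Vendor A takes microlitres under a terse command name.
    return {"cmd": "DISP", "ul": params["volume_ul"]}


@mapping_rule("vendor_b", "dispense")
def _vendor_b_dispense(params):
    # Vendor B expects millilitres, so the rule converts units.
    return {"method": "aspirate_dispense", "volume_ml": params["volume_ul"] / 1000}


def lower(step, vendor):
    """Core lowering: look up the rule; the core holds no per-vendor logic."""
    try:
        rule = MAPPING_RULES[(vendor, step["action"])]
    except KeyError:
        raise NotImplementedError(
            f"no mapping rule for {vendor}/{step['action']}"
        ) from None
    return rule(step["params"])


step = {"action": "dispense", "params": {"volume_ul": 50}}
print(lower(step, "vendor_a"))
print(lower(step, "vendor_b"))
```

Under this design, supporting a new instrument means registering rules rather than changing `lower`, which is the sense in which heterogeneity would be handled outside the core. It does not remove per-instrument work, which is the referee's underlying concern; it only relocates it.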
Circularity Check
No circularity: architectural proposal without derivations or self-referential predictions
Full rationale
The paper proposes a conceptual paradigm called Experiment-as-Code Labs, encoding experiments as declarative configurations compiled to device APIs, with AI agents generating hypotheses and a systems layer handling analysis and orchestration. No equations, fitted parameters, predictions, or derivation chains are present in the provided text or abstract. The central claims concern system architecture and generality across labs rather than any quantity or result that reduces to its own inputs by construction. Self-citations are absent from the load-bearing claims, and the proposal does not invoke uniqueness theorems or ansatzes from prior work. This is a standard non-finding for a design-oriented systems paper.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: AI agents can reliably produce declarative experiment configurations that capture necessary experimental intent and safety constraints
- domain assumption: A general systems layer can perform program analysis, safety checks, and orchestration across arbitrary labs and instruments
invented entities (1)
- Experiment-as-Code Labs declarative stack: no independent evidence