pith. machine review for the scientific record.

arxiv: 2604.02688 · v2 · submitted 2026-04-03 · ❄️ cond-mat.mtrl-sci · cs.SE

Recognition: 2 theorem links · Lean Theorem

MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration

Boris I. Yakobson, Chenmu Zhang


Pith reviewed 2026-05-13 18:58 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci cs.SE
keywords LLM agent · materials science · autonomous workflows · code generation · computational materials · guided autonomy · machine learning force fields · ferroelectric materials

The pith

An LLM agent autonomously handles end-to-end materials simulations by writing and executing its own code.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MatClaw as a code-first LLM agent that directly generates and runs Python code, composing materials-science libraries into complex workflows on HPC clusters. Unlike previous agents limited by fixed tool functions, it uses a four-layer memory system to maintain coherence over long-running tasks and retrieval-augmented generation over library source code to achieve high API-call accuracy. Demonstrations on ferroelectric material tasks show reliable code handling but highlight the need for guidance on tacit knowledge such as simulation protocols, which can be supplied through literature self-learning and expert-specified constraints. This establishes a model of guided autonomy in which the agent manages execution while researchers provide domain expertise. If correct, it narrows the gap to fully autonomous computational research as LLMs improve.
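As a concrete illustration of what "code-first" means here: rather than dispatching to a fixed catalogue of tool functions, the agent emits a complete Python script at each step, executes it, and folds the captured output back into its context. The sketch below is a minimal, hypothetical version of that loop; the `generate` callable stands in for the LLM, and MatClaw's actual prompt format, sandboxing, and HPC submission layer are not specified on this page.

```python
import subprocess
import sys
import tempfile
from typing import Callable

def code_first_step(generate: Callable[[str], str], task: str,
                    history: list[str], timeout_s: int = 600) -> str:
    """One turn of a code-first agent loop (illustrative, not MatClaw's API).

    The model writes a whole Python script; we run it and return the captured
    output, which becomes the next observation appended to the history."""
    prompt = (
        f"Task: {task}\n\n"
        + "\n".join(history)
        + "\n\nWrite a complete Python script for the next step."
    )
    script = generate(prompt)
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(script)
        path = f.name
    run = subprocess.run([sys.executable, path], capture_output=True,
                         text=True, timeout=timeout_s)
    observation = run.stdout + run.stderr
    history.append(f"SCRIPT:\n{script}\nOUTPUT:\n{observation}")
    return observation
```

Because the action space is "any Python the interpreter accepts", the same loop can drive pymatgen, DeePMD-kit, or a batch-submission script without adding new tool wrappers.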

Core claim

MatClaw writes and executes Python directly to orchestrate any installed domain library for multi-code workflows, without predefined tool functions. A four-layer memory architecture sustains coherent execution across multi-day workflows, while retrieval-augmented generation over domain source code raises per-step API-call accuracy to ~99%. End-to-end demonstrations show that code generation is reliable but that the agent struggles with tacit domain knowledge such as simulation timescales and equilibration protocols; literature self-learning and expert constraints bridge these gaps to enable guided autonomy.
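The ~99% figure rests on retrieving real library source before the model writes a call. Below is a minimal sketch of that retrieval step, assuming per-function chunking and a naive token-overlap ranking; `function_chunks` and `retrieve` are illustrative names, and the paper's own pipeline (compared across chunkers in Figure 5) uses stronger retrieval than this.

```python
import ast
import inspect
from collections import Counter

def function_chunks(module) -> dict[str, str]:
    """Split a library module's source into per-function chunks (AST-based
    chunking is one of the strategies compared in Figure 5)."""
    src = inspect.getsource(module)
    return {
        node.name: ast.get_source_segment(src, node) or ""
        for node in ast.walk(ast.parse(src))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

def retrieve(query: str, chunks: dict[str, str], k: int = 3) -> list[str]:
    """Rank chunks by token overlap with the query. A production system would
    use embeddings plus rank fusion, but the effect is the same: retrieved
    source text is prepended to the prompt so the model sees real signatures
    instead of guessing them."""
    q = Counter(query.lower().split())

    def overlap(text: str) -> int:
        return sum((q & Counter(text.lower().split())).values())

    return sorted(chunks.values(), key=overlap, reverse=True)[:k]
```

In use, the top-ranked chunks for a query like "compute the space group of a Structure" would be placed ahead of the code-generation instruction, which is what pushes per-step API-call accuracy up.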

What carries the argument

A code-first agent that dynamically writes and executes Python code, augmented by a four-layer memory architecture for long-term coherence and by retrieval-augmented generation over domain source code for accurate API usage.
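The review does not name the four memory layers, so the sketch below is only a plausible decomposition: a pinned task brief, a periodically rewritten state summary, a short window of raw recent output, and a full archive searched on demand. The layer names and the window size are assumptions, not the paper's design.

```python
from dataclasses import dataclass, field

@dataclass
class LayeredMemory:
    """Illustrative four-layer memory; MatClaw's actual layers may differ."""
    task_brief: str = ""                               # layer 1: immutable task description
    state_summary: str = ""                            # layer 2: periodically rewritten plan/state
    recent: list[str] = field(default_factory=list)    # layer 3: last few raw observations
    archive: list[str] = field(default_factory=list)   # layer 4: full log, searched on demand
    window: int = 8

    def record(self, observation: str) -> None:
        self.archive.append(observation)
        self.recent = (self.recent + [observation])[-self.window:]

    def prompt_context(self) -> str:
        # Only a bounded slice of a multi-day run ever re-enters the prompt,
        # which is one way to avoid the progressive context loss the paper
        # attributes to naive conversation histories.
        return "\n\n".join([self.task_brief, self.state_summary, *self.recent])
```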

If this is right

  • LLMs can reliably generate and interpret scientific code for materials tasks, reducing manual coding effort.
  • Multi-day autonomous workflows become practical with proper memory management.
  • Guided autonomy, combining agent execution with human domain knowledge, accelerates materials discovery.
  • Rapid LLM improvements will outpace manual workflows in exploration speed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar code-first agents could be applied to other fields like chemistry or biology that rely on scripted simulations.
  • Open-sourcing the code allows testing on new problems to refine the approach.
  • Future versions might minimize the need for expert constraints as models learn more from data.
  • Integration with real-time experimental data could create closed-loop discovery systems.

Load-bearing premise

Tacit domain knowledge about simulation parameters and protocols can be reliably supplied by self-learning from literature and expert constraints without causing systematic errors.
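One hedged way to picture the expert-constraint half of that premise is a small set of machine-checkable bounds that the agent's proposed protocol must satisfy before any HPC time is spent. The constraint names and values below are invented for illustration; the paper's actual constraint format is not given on this page.

```python
# Hypothetical constraint names and values, for illustration only.
EXPERT_CONSTRAINTS = {
    "min_equilibration_ps": 30.0,  # discard at least this much MD before averaging
    "min_production_ps": 60.0,     # minimum sampled window per temperature
    "max_timestep_fs": 2.0,        # largest integration step the force field tolerates
}

def violations(proposed: dict[str, float], constraints: dict[str, float]) -> list[str]:
    """Return human-readable violations so the agent can revise its plan
    before submitting jobs."""
    problems = []
    if proposed.get("equilibration_ps", 0.0) < constraints["min_equilibration_ps"]:
        problems.append("equilibration window shorter than expert minimum")
    if proposed.get("production_ps", 0.0) < constraints["min_production_ps"]:
        problems.append("production window shorter than expert minimum")
    if proposed.get("timestep_fs", float("inf")) > constraints["max_timestep_fs"]:
        problems.append("timestep larger than expert maximum")
    return problems
```

The premise is then that a handful of such bounds, plus literature self-learning, is enough to keep the agent's remaining free choices from drifting into systematic error.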

What would settle it

Running the agent on a new materials problem with only minimal constraints and checking if it selects incorrect simulation timescales or sampling strategies that lead to wrong physical conclusions.
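In miniature, that test is a convergence check on the agent's own protocol choices: rerun with a longer sampling window (or denser sampling) and see whether the physical conclusion moves. A sketch under the assumption that the conclusion is a Curie-temperature estimate read off a temperature sweep of the order parameter; the threshold-crossing estimator here is a crude stand-in for whatever fit the agent actually performs.

```python
import numpy as np

def tc_estimate(temps, order_param, threshold: float = 0.5) -> float:
    """Crude Tc proxy: the temperature at which the normalized order parameter
    first drops below `threshold`, linearly interpolated. Assumes `temps` is
    sorted ascending and the order parameter decays with temperature."""
    temps = np.asarray(temps, dtype=float)
    q = np.asarray(order_param, dtype=float)
    q = q / q.max()
    below = np.flatnonzero(q < threshold)
    if below.size == 0 or below[0] == 0:
        return float(temps[0])
    i = below[0]
    t0, t1, q0, q1 = temps[i - 1], temps[i], q[i - 1], q[i]
    return float(t0 + (q0 - threshold) * (t1 - t0) / (q0 - q1))

def timescale_adequate(temps, q_short, q_long, tol_K: float = 10.0) -> bool:
    """If extending the per-temperature window (e.g. 60 ps -> 100 ps, as in
    Figure 2) shifts the inferred transition by more than `tol_K`, the shorter
    protocol the agent chose was not adequate."""
    return abs(tc_estimate(temps, q_short) - tc_estimate(temps, q_long)) <= tol_K
```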

Figures

Figures reproduced from arXiv: 2604.02688 by Boris I. Yakobson, Chenmu Zhang.

Figure 1. MatClaw architecture. The researcher provides a task description in natural language.

Figure 2. Ferroelectric order parameter Q(T) = ⟨|η(t)|⟩ of monolayer CIPS from DeePMD MD, produced autonomously by MatClaw. Inset: side view of the CuInP2S6 monolayer structure. Open squares show the initial 60 ps sweep (last 30 ps averaged); filled circles show the final data after extending near-transition temperatures to 100 ps. The dashed line marks the estimated Tc = 261 K. Error bars are block-averaged standar…

Figure 3. Agent-driven heuristic search through (E, T) parameter space. Each point represents one E-field MD simulation on a 1×25×1 CIPS supercell (500 atoms). Color indicates the domino metric (slope of ⟨|∆t(d)|⟩ vs. site separation d). Gray crosses mark conditions where fewer than 30% of Cu sites flipped. The blue-circled point (Ez = −0.16 V/Å, T = 50 K, slope = 0.32 ps/site) is the best condition found. The dotte…

Figure 4. Domain wall propagation at the optimal condition (…)

Figure 5. Chunking method comparison on pymatgen code QA (300 questions, Gemini 3.0 Flash, …)

Figure 6. Pymatgen code QA accuracy (300 questions) across five LLMs, with and without RAG.

Figure 7. QA accuracy across three domain libraries (Gemini 3.0 Flash). Without RAG, accuracy…
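For readers reproducing Figure 2's error bars, the block-averaging convention its caption refers to amounts to the following; a minimal sketch assuming the per-frame order parameter η(t) from the production window is already available as an array, with the block count left as a free choice.

```python
import numpy as np

def block_averaged(eta, n_blocks: int = 5) -> tuple[float, float]:
    """Mean of |eta(t)| plus a block-averaged standard error. Correlated MD
    frames are grouped into blocks whose means are treated as approximately
    independent samples, the usual way to get honest error bars from a
    single trajectory."""
    x = np.abs(np.asarray(eta, dtype=float))
    block_means = np.array([b.mean() for b in np.array_split(x, n_blocks)])
    return float(block_means.mean()), float(block_means.std(ddof=1) / np.sqrt(n_blocks))
```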
original abstract

Existing LLM agents for computational materials science are constrained by pipeline-bounded architectures tied to specific simulation codes and by dependence on manually written tool functions that grow with task scope. We present MatClaw, a code-first agent that writes and executes Python directly, composing any installed domain library to orchestrate multi-code workflows on remote HPC clusters without predefined tool functions. To sustain coherent execution across multi-day workflows, MatClaw uses a four-layer memory architecture that prevents progressive context loss, and retrieval-augmented generation over domain source code that raises per-step API-call accuracy to ~99%. Three end-to-end demonstrations on ferroelectric CuInP2S6 (machine-learning force field training via active learning, Curie temperature prediction, and heuristic parameter-space search) reveal that the agent handles code generation reliably but struggles with tacit domain knowledge. The missing knowledge, such as appropriate simulation timescales, equilibration protocols, and sampling strategies, is the kind that researchers accumulate through experience but rarely formalize. Two lightweight interventions, literature self-learning and expert-specified constraints, bridge these gaps, defining a guided autonomy model in which the researcher provides high-level domain knowledge while the agent handles workflow execution. Our results demonstrate that the gap between guided and fully autonomous computational materials research is narrower than ever before: LLMs already handle code generation and scientific interpretation reliably, and the rapid improvement in their capabilities will accelerate materials discovery beyond what manual workflows can achieve. All code and benchmarks are open-source.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents MatClaw, a code-first LLM agent that directly writes and executes Python code to orchestrate multi-code materials workflows on HPC clusters without predefined tool functions. It introduces a four-layer memory architecture to maintain coherence over multi-day runs and retrieval-augmented generation over domain source code to achieve ~99% per-step API accuracy. Three end-to-end demonstrations on ferroelectric CuInP2S6 (active-learning ML force-field training, Curie-temperature prediction, and heuristic parameter-space search) show reliable code generation, but the agent requires two lightweight interventions—literature self-learning and expert-specified constraints—to address gaps in tacit domain knowledge such as simulation timescales and equilibration protocols. The work concludes that guided autonomy already narrows the gap to fully autonomous computational materials research and releases all code and benchmarks as open source.

Significance. If the guided-autonomy model can be shown to require only minimal, non-systematic expert input, the approach would meaningfully lower the barrier to complex multi-code materials simulations by shifting researcher effort from scripting to high-level guidance. The open-source release and concrete demonstrations on a real ferroelectric material provide a practical foundation for further development in the field.

major comments (3)
  1. [CuInP2S6 demonstrations] In the CuInP2S6 demonstrations section: the central claim that the two interventions are 'lightweight' and that guided autonomy already narrows the gap to full autonomy is load-bearing, yet the manuscript supplies no counts of constraints or retrieval steps injected per workflow, no failure rates before versus after intervention, and no comparison of total human oversight hours against a manual baseline. Without these data the reliability of the autonomy claim cannot be assessed.
  2. [four-layer memory architecture] Description of the four-layer memory architecture: the architecture is presented as essential for preventing progressive context loss across multi-day workflows, but no ablation study or quantitative comparison against simpler memory mechanisms (e.g., standard conversation history or vector-store retrieval alone) is provided to establish its necessity or performance gain.
  3. [RAG over domain source code] RAG evaluation: the statement that retrieval-augmented generation over domain source code raises per-step API-call accuracy to ~99% is central to the code-first reliability claim, but the manuscript does not report the size or composition of the test set, the definition of 'API-call accuracy,' or the distribution of error types, preventing independent assessment of generalizability.
minor comments (2)
  1. [Abstract] The abstract contains several long, compound sentences that would benefit from splitting to improve readability.
  2. [four-layer memory architecture] Notation for the four-layer memory components is introduced without an accompanying diagram or explicit pseudocode, making the architecture harder to follow on first reading.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help us strengthen the quantitative aspects of our claims on guided autonomy and the technical components of MatClaw. We address each major point below and will incorporate revisions accordingly.

point-by-point responses
  1. Referee: In the CuInP2S6 demonstrations section: the central claim that the two interventions are 'lightweight' and that guided autonomy already narrows the gap to full autonomy is load-bearing, yet the manuscript supplies no counts of constraints or retrieval steps injected per workflow, no failure rates before versus after intervention, and no comparison of total human oversight hours against a manual baseline. Without these data the reliability of the autonomy claim cannot be assessed.

    Authors: We agree that additional quantitative data would better support the claim that the interventions are lightweight. In the revised manuscript, we will add a dedicated subsection or table detailing the number of constraints and retrieval steps per demonstration workflow, pre- and post-intervention failure rates, and an estimate of human oversight hours relative to a manual baseline. This will provide a clearer assessment of the guided autonomy model's efficiency. revision: yes

  2. Referee: Description of the four-layer memory architecture: the architecture is presented as essential for preventing progressive context loss across multi-day workflows, but no ablation study or quantitative comparison against simpler memory mechanisms (e.g., standard conversation history or vector-store retrieval alone) is provided to establish its necessity or performance gain.

    Authors: We recognize the value of an ablation study to demonstrate the necessity of the four-layer memory architecture. We will include such an analysis in the revised paper, comparing the full architecture to simpler alternatives like standard conversation history and vector-store retrieval, using metrics such as context retention success rate and overall workflow completion over extended simulations. revision: yes

  3. Referee: RAG evaluation: the statement that retrieval-augmented generation over domain source code raises per-step API-call accuracy to ~99% is central to the code-first reliability claim, but the manuscript does not report the size or composition of the test set, the definition of 'API-call accuracy,' or the distribution of error types, preventing independent assessment of generalizability.

    Authors: We agree that more details are needed for the RAG evaluation to allow independent assessment. The revised manuscript will specify the test set size and composition, define 'API-call accuracy' explicitly (e.g., correct invocation including function name and parameters), and provide the distribution of error types encountered during testing. revision: yes
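To make the promised definition concrete before the revision lands, here is one hedged reading of "per-step API-call accuracy" as a grading function: a generated snippet counts as correct if the first call node in its AST has the same callee and the same keyword-argument names as the reference call. This illustrates the rebuttal's proposed definition; it is not the benchmark's actual grading rule.

```python
import ast
from typing import Optional

def api_call_matches(generated: str, reference: str) -> bool:
    """Compare the first Call node found in each snippet's AST: same callee
    expression and same set of keyword-argument names. Positional arguments,
    aliases, and semantically equivalent calls would need a richer check."""
    def first_call(src: str) -> Optional[ast.Call]:
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, ast.Call):
                return node
        return None

    g, r = first_call(generated), first_call(reference)
    if g is None or r is None:
        return False
    same_callee = ast.dump(g.func) == ast.dump(r.func)
    same_kwargs = (sorted(k.arg or "" for k in g.keywords)
                   == sorted(k.arg or "" for k in r.keywords))
    return same_callee and same_kwargs
```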

Circularity Check

0 steps flagged

No circularity: system description with empirical demos, no derivations or fitted predictions

full rationale

The manuscript describes an LLM agent architecture (four-layer memory, RAG over source code) and reports three end-to-end workflow demonstrations on CuInP2S6. No equations, no fitted parameters, no predictions of physical quantities, and no self-citation chains that justify uniqueness theorems or ansatzes appear in the text. The central claim rests on open-source code and observed success after lightweight interventions, which are externally verifiable rather than self-referential. This matches the default expectation of a non-circular empirical systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on assumptions about LLM code-generation reliability when augmented with memory and retrieval, plus the effectiveness of the newly introduced architectural components; no numerical parameters are fitted to data.

axioms (1)
  • domain assumption Large language models can generate and execute correct scientific code for materials workflows when provided with retrieval-augmented access to domain source code.
    This assumption directly supports the reported ~99% per-step API accuracy and overall workflow reliability.
invented entities (1)
  • four-layer memory architecture · no independent evidence
    purpose: Prevents progressive context loss during multi-day autonomous workflows
    New component introduced to maintain coherence across long-running agent executions.

pith-pipeline@v0.9.0 · 5567 in / 1484 out tokens · 56247 ms · 2026-05-13T18:58:38.223961+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research

    cond-mat.mtrl-sci · 2026-05 · unverdicted · novelty 6.0

    OpenAaaS is a hierarchical agent-as-a-service system that enables secure multi-agent collaboration for materials informatics by moving code to data rather than data to code.
