AdsMind: A Physics-Grounded Multi-Agent System for Self-Correcting Discovery of Adsorption Configurations on Heterogeneous Catalyst Surfaces

Bowen Zhang; Edvin Fako; Junwu Chen; Lixue Cheng; Philippe Schwaller; Ryo Kuroki; Xuan Vu Nguyen; Yuyang Lou; Zongmin Zhang

arXiv: 2606.19152 · v1 · pith:QLSKFWE4new · submitted 2026-06-17 · cond-mat.mtrl-sci · cs.AI

Zongmin Zhang, Yuyang Lou, Bowen Zhang, Junwu Chen, Ryo Kuroki, Xuan Vu Nguyen, Edvin Fako, Lixue Cheng, Philippe Schwaller

Pith reviewed 2026-06-26 20:00 UTC · model grok-4.3

keywords adsorption configurations · heterogeneous catalysis · multi-agent systems · machine learning force fields · self-correction · catalyst surfaces · configuration search

The pith

A closed-loop multi-agent system uses machine-learning force field feedback to self-correct adsorption configuration searches on catalyst surfaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AdsMind as a framework in which LLM-based agents propose surface-adsorbate configurations, receive structural and energetic feedback from machine-learning force field relaxations, and iteratively correct their proposals until a reliable low-energy structure is identified. This closed loop is intended to overcome the lack of physics-based correction in open-loop LLM agents and the prohibitive cost of exhaustive ab initio searches. If the method works as described, it supplies reliable adsorption geometries for heterogeneous catalysis modeling while using roughly one-fourteenth the number of relaxations required by heuristic enumeration.

Core claim

AdsMind is a closed-loop multi-agent framework that enables autonomous error correction through MLFF relaxation feedback. Across four LLM backends it achieves success rates of 100 percent and 98.8 percent on the AA20 and OCD-GMAE62 benchmarks while requiring only 4.11 and 4.67 MLFF relaxations per case, an approximately 14-fold reduction over heuristic baselines. DFT validation on six AA20 systems shows that open-loop outputs produce qualitative adsorption-energy sign errors for molecular adsorbates, whereas AdsMind preserves the correct sign with closer quantitative agreement.
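The "approximately 14-fold" figure can be turned into a rough per-case budget for the heuristic baselines. The sketch below is back-of-envelope arithmetic only: the baseline relaxation counts are not stated in this summary, so `implied_baseline` (a hypothetical helper) simply multiplies the reported AdsMind averages by the quoted factor.

```python
def implied_baseline(relaxations_per_case, fold_reduction=14.0):
    """Relaxations per case a baseline would need if the quoted fold reduction is exact.

    The factor 14 is the approximate value quoted in the summary, so the
    result is an estimate, not a number reported by the paper.
    """
    return relaxations_per_case * fold_reduction


# Reported AdsMind averages (MLFF relaxations per case) from the summary above.
reported = {"AA20": 4.11, "OCD-GMAE62": 4.67}

for benchmark, n_relax in reported.items():
    print(f"{benchmark}: ~{implied_baseline(n_relax):.1f} baseline relaxations per case")
```

Under that assumption the baselines would sit at roughly 58 and 65 relaxations per case, which gives a sense of scale for the claimed saving.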

What carries the argument

Closed-loop multi-agent architecture that feeds MLFF relaxation results back to LLM agents for iterative proposal correction.
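A minimal sketch of such a propose, relax, and correct loop, for illustration only. Everything named here is a hypothetical stand-in: `propose_site` replaces the LLM agents, `relax_with_mlff` replaces a real MLFF relaxation with a table lookup, and the energies are invented. The stopping rule (give up after proposals stop improving) is likewise an assumption, not the paper's criterion.

```python
from dataclasses import dataclass, field

# Toy stand-in for MLFF-relaxed energies (eV) per adsorption site; values are invented.
TOY_RELAXED_ENERGY = {"top": -0.42, "bridge": -0.61, "hollow": -0.88}


@dataclass
class SearchState:
    history: list = field(default_factory=list)  # (site, energy) feedback seen so far

    @property
    def best(self):
        """Lowest-energy (site, energy) pair so far, or None if nothing was relaxed."""
        return min(self.history, key=lambda r: r[1]) if self.history else None


def propose_site(state):
    """Hypothetical proposer: return the first site not yet tried.

    In a real system an LLM agent would read `state.history` and reason about it.
    """
    tried = {site for site, _ in state.history}
    for site in TOY_RELAXED_ENERGY:
        if site not in tried:
            return site
    return None


def relax_with_mlff(site):
    """Hypothetical relaxation call; here it is only a table lookup."""
    return TOY_RELAXED_ENERGY[site]


def closed_loop_search(max_relaxations=5, patience=1, tol=1e-3):
    """Propose, relax, feed the result back, and stop once proposals stop improving."""
    state = SearchState()
    stale = 0  # consecutive relaxations that did not lower the best energy
    while len(state.history) < max_relaxations:
        site = propose_site(state)
        if site is None:  # nothing left to try
            break
        previous_best = state.best[1] if state.best else float("inf")
        energy = relax_with_mlff(site)
        state.history.append((site, energy))  # feedback available to the next proposal
        stale = 0 if energy < previous_best - tol else stale + 1
        if stale > patience:
            break
    return state.best, len(state.history)
```

Calling `closed_loop_search()` returns the best site found together with the number of relaxations spent, which is the quantity the efficiency claim is about.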

If this is right

Energy dispersion across different LLM backends is reduced relative to the single-pass ablation.
High reliability holds across four tested LLM backends.
Closer quantitative agreement with DFT is obtained than with open-loop agent outputs.
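One way to quantify the cross-backend energy dispersion mentioned above is the spread of final energies across LLM backends for each system. The sketch below assumes a population standard deviation per system, averaged over systems; the paper's exact dispersion metric is not given in this summary, and the function name and data layout are hypothetical.

```python
from statistics import mean, pstdev


def cross_backend_dispersion(energies_by_backend):
    """Spread of final energies across backends.

    energies_by_backend: {backend_name: {system_name: energy_in_eV}}
    Returns (per-system standard deviation, mean of those deviations).
    Only systems present for every backend are compared.
    """
    systems = set.intersection(*(set(e) for e in energies_by_backend.values()))
    per_system = {
        s: pstdev([e[s] for e in energies_by_backend.values()])
        for s in sorted(systems)
    }
    return per_system, mean(list(per_system.values()))
```

Comparing this number for the single-pass ablation and for the closed loop, on the same systems and backends, would reproduce the kind of comparison the claim describes.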

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The feedback loop could be adapted to other surface or molecular configurational search tasks that currently rely on single-pass LLM proposals.
The reduction in required relaxations could make systematic screening of larger or more complex catalyst surfaces computationally feasible.
Occasional insertion of direct DFT calculations inside the loop might further improve accuracy while retaining most of the speed gain.

Load-bearing premise

The machine-learning force field relaxation supplies sufficiently accurate structural and energetic feedback that the LLM agents can reliably interpret to correct erroneous proposals.

What would settle it

A benchmark case in which the closed-loop process converges to a configuration whose DFT-computed energy is higher than the true minimum identified by exhaustive enumeration.
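The test described above reduces to a single comparison per benchmark case. A minimal sketch follows; the function name and the 0.05 eV tolerance are assumptions chosen for illustration, not values from the paper.

```python
def loop_missed_minimum(loop_dft_energy, enumerated_dft_energies, tol=0.05):
    """Return True if the closed-loop result lies above the exhaustive minimum.

    loop_dft_energy: DFT energy (eV) of the configuration the loop converged to.
    enumerated_dft_energies: DFT energies (eV) from exhaustive enumeration.
    tol: hypothetical tolerance (eV) below which two energies count as equal.
    """
    reference = min(enumerated_dft_energies)
    return loop_dft_energy - reference > tol
```

A single case where this returns `True` would be the counterexample; a systematic run over a benchmark would report how often it happens.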

Figures

Figures reproduced from arXiv: 2606.19152 by Bowen Zhang, Edvin Fako, Junwu Chen, Lixue Cheng, Philippe Schwaller, Ryo Kuroki, Xuan Vu Nguyen, Yuyang Lou, Zongmin Zhang.

**Figure 2.** Compares the reported Adsorb-Agent values and AdsMind outputs with the 14 [caption truncated in source]

**Figure 3.** [caption not recovered]

**Figure 4.** [caption not recovered]

**Figure 5.** [caption not recovered]

Original abstract

Identifying the lowest-energy surface-adsorbate configuration is critical for modeling heterogeneous catalysis, yet exhaustive exploration with ab initio calculations is computationally prohibitive. Machine-learning force fields (MLFFs) accelerate structural relaxation but leave the search over the vast configurational space a major bottleneck, and open-loop large language model (LLM) agents lack a physics-grounded feedback mechanism to correct erroneous initial guesses. We propose AdsMind (Adsorption configuration discovery with Machine intelligence and relaxation feedback), a closed-loop multi-agent framework that enables autonomous error correction through MLFF relaxation feedback. Across four LLM backends, AdsMind achieves consistently high search reliability, with success rates of 100% and 98.8% on the benchmarks AA20 and OCD-GMAE62. Relative to its single-pass (1-Shot) ablation it reduces cross-backend energy dispersion, and it uses only 4.11 and 4.67 MLFF relaxations per case, respectively -- an approximately 14-fold reduction over heuristic enumeration baselines. Density functional theory (DFT) validation using VASP/PBE on six representative AA20 systems shows that the reported open-loop Adsorb-Agent outputs exhibit qualitative adsorption-energy sign errors for molecular adsorbates, whereas AdsMind preserves the correct sign in all tested cases with closer quantitative agreement. AdsMind thus delivers reliability, self-reflection, and interpretability simultaneously, supporting more DFT-informed autonomous chemistry workflows.
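The sign errors discussed in the abstract are easiest to see with the adsorption energy written out. The sketch below assumes the common convention E_ads = E(slab+adsorbate) - E(slab) - E(adsorbate reference), under which a negative value means binding is favourable; the paper's exact reference states are not given in this summary, and both helper names are hypothetical.

```python
def adsorption_energy(e_slab_ads, e_slab, e_adsorbate_ref):
    """Adsorption energy in eV under the assumed convention.

    E_ads = E(slab + adsorbate) - E(slab) - E(adsorbate reference)
    Negative values indicate favourable adsorption.
    """
    return e_slab_ads - e_slab - e_adsorbate_ref


def same_sign(predicted, reference):
    """True if both energies agree on whether adsorption is favourable."""
    return (predicted < 0) == (reference < 0)
```

A "qualitative sign error" in this sense is a case where `same_sign(predicted, dft_reference)` is `False`, for example a predicted +0.3 eV against a DFT value of -0.5 eV.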

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

AdsMind adds a closed-loop multi-agent setup with MLFF feedback to fix LLM adsorption guesses, but the DFT validation is too narrow to confirm the feedback actually drives the gains.

The new element here is the closed-loop architecture: LLM agents propose adsorption sites, MLFFs relax them, and the results feed back for correction. That differs from the open-loop agents and pure heuristic searches they benchmark against, and it delivers the reported drop to roughly 4 relaxations per case on AA20 and OCD-GMAE62 while keeping success rates at 100% and 98.8%.

The paper does a solid job laying out the multi-agent roles, showing lower energy dispersion than the single-pass ablation, and including a small DFT spot-check on six AA20 cases where AdsMind avoids the sign errors seen in open-loop outputs. Those concrete numbers and the 14-fold reduction versus enumeration baselines are useful for anyone running catalyst screening.

The soft spot is the one the stress-test note flags. The whole self-correction story assumes the MLFF relaxations give the agents reliable enough signals to spot and fix bad proposals. Yet the only DFT cross-check is on six selected systems; nothing in the abstract or reported results shows MLFF-versus-DFT energy or force errors across the full 82 cases. If the MLFF has adsorbate- or surface-specific biases, the high success rates and sign preservation could partly reflect the surrogate rather than genuine correction. No error bars or full benchmark definitions are given either.

This is aimed at computational chemists working on automated adsorption searches or LLM agents in materials. A reader who wants practical benchmarks on multi-agent chemistry tools will find usable numbers here.

It should go to peer review. The idea addresses a real bottleneck with a workable loop, and the empirical comparisons are clear enough to merit referee scrutiny even with the validation gap.

Referee Report

2 major / 2 minor

Summary. The paper introduces AdsMind, a closed-loop multi-agent LLM framework that incorporates MLFF relaxation feedback for autonomous discovery and self-correction of adsorption configurations on heterogeneous catalyst surfaces. It reports success rates of 100% and 98.8% on the AA20 and OCD-GMAE62 benchmarks, respectively, with averages of 4.11 and 4.67 MLFF relaxations per case (an approximately 14-fold reduction vs. heuristic baselines), reduced energy dispersion relative to the single-pass ablation, and better DFT sign preservation and quantitative agreement than open-loop agents on a small validation subset.

Significance. If the central claims hold under broader validation, the work would be significant for autonomous catalysis workflows: it demonstrates a practical route to combine LLM reasoning with surrogate physics feedback to achieve high reliability at low computational cost, addressing a key bottleneck in exhaustive configurational search. The explicit comparison to open-loop baselines and the emphasis on self-correction are strengths that could support more DFT-informed discovery pipelines.

major comments (2)

[Abstract] Abstract and Results (benchmark evaluation): Success rates of 100% (AA20) and 98.8% (OCD-GMAE62) and the 4.11 and 4.67 MLFF relaxations per case are defined entirely with respect to MLFF-relaxed energies and structures. No table or section reports MLFF-vs-DFT energy differences, force errors, or sign-error statistics across the full 20+62 cases; the only DFT evidence is on six hand-selected AA20 systems. This leaves the load-bearing premise (that MLFF feedback is sufficiently accurate for reliable agent self-correction) unverified for the reported benchmarks.
[Abstract] Abstract: The claim that AdsMind 'preserves the correct sign in all tested cases with closer quantitative agreement' rests on the six-system DFT subset. Without a systematic cross-check (e.g., parity plots or error distributions) on the full benchmark sets or at least a representative sample stratified by adsorbate type and surface termination, it is not possible to rule out that the reported advantage is an artifact of the particular MLFF surrogate rather than genuine physics-grounded correction.
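The systematic cross-check requested in these comments could be summarised with a small table of MLFF-versus-DFT errors, stratified by group. The sketch below is one possible form of that summary, not something the paper reports: the record layout, the grouping labels, and the function name are assumptions, and a negative adsorption energy is taken to mean favourable binding.

```python
from collections import defaultdict


def parity_summary(records):
    """Per-group MLFF-vs-DFT error statistics.

    records: iterable of (group, e_mlff, e_dft), energies in eV. `group` can be
    any stratification label, e.g. adsorbate type or surface termination.
    Returns {group: (n_cases, mean_absolute_error, n_sign_errors)}.
    """
    buckets = defaultdict(list)
    for group, e_mlff, e_dft in records:
        buckets[group].append((e_mlff, e_dft))

    summary = {}
    for group, pairs in buckets.items():
        mae = sum(abs(m - d) for m, d in pairs) / len(pairs)
        # A sign error is a case where MLFF and DFT disagree on favourability.
        sign_errors = sum((m < 0) != (d < 0) for m, d in pairs)
        summary[group] = (len(pairs), mae, sign_errors)
    return summary
```

Reporting this per group rather than in aggregate is what would expose an adsorbate- or surface-specific bias in the surrogate.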

minor comments (2)

[Abstract] Abstract states results 'across four LLM backends' but provides no per-backend breakdown of success rates, relaxation counts, or energy dispersion; a supplementary table would improve reproducibility.
No error bars or uncertainty estimates accompany the reported success rates or average relaxation counts, making it difficult to assess robustness across random seeds or LLM sampling variations.
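For the success rates, one standard way to attach an uncertainty is a Wilson score interval on the underlying counts. The sketch below shows the calculation; the numbers of runs behind the reported percentages are not given in this summary, so any counts passed in are the reader's assumption.

```python
from math import sqrt


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)
```

As an illustration, 20 successes out of 20 hypothetical runs gives a lower bound near 0.84, which shows why a bare "100%" on a small benchmark says less than it appears to.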

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments correctly identify that the reported benchmark success rates rely on the MLFF surrogate and that DFT validation is limited to a six-system subset. We address both points below and will revise the manuscript to improve clarity on these limitations while preserving the core claims.

Referee: [Abstract] Abstract and Results (benchmark evaluation): Success rates of 100% (AA20) and 98.8% (OCD-GMAE62) and the 4.11 and 4.67 MLFF relaxations per case are defined entirely with respect to MLFF-relaxed energies and structures. No table or section reports MLFF-vs-DFT energy differences, force errors, or sign-error statistics across the full 20+62 cases; the only DFT evidence is on six hand-selected AA20 systems. This leaves the load-bearing premise (that MLFF feedback is sufficiently accurate for reliable agent self-correction) unverified for the reported benchmarks.

Authors: We agree that the benchmark success rates and relaxation counts are defined with respect to MLFF-relaxed structures, as exhaustive DFT evaluation of all 82 cases is computationally prohibitive. The MLFF used is a catalysis-specific model previously validated against DFT. In the revised manuscript we will add a new Results subsection reporting MLFF-vs-DFT energy and force errors on the six validated systems, and we will explicitly state in the abstract and methods that primary metrics are MLFF-based with DFT validation on a representative subset. This revision will better contextualize the reliability of the feedback loop.
revision: yes
Referee: [Abstract] Abstract: The claim that AdsMind 'preserves the correct sign in all tested cases with closer quantitative agreement' rests on the six-system DFT subset. Without a systematic cross-check (e.g., parity plots or error distributions) on the full benchmark sets or at least a representative sample stratified by adsorbate type and surface termination, it is not possible to rule out that the reported advantage is an artifact of the particular MLFF surrogate rather than genuine physics-grounded correction.

Authors: The six AA20 systems were chosen to span molecular and atomic adsorbates as well as different surface terminations. We acknowledge that parity plots and explicit stratification would strengthen the evidence. In the revision we will add parity plots and error distributions for these systems, describe the selection criteria, and include a brief discussion of surrogate limitations. We maintain that the observed sign preservation supports the benefit of closed-loop correction, but we will clarify the limited scope of the DFT comparison.
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical benchmark outcomes with no self-referential derivation

The paper presents AdsMind as an empirical multi-agent framework evaluated on fixed external benchmarks (AA20, OCD-GMAE62) via reported success rates, relaxation counts, and limited DFT checks on six hand-selected cases. No equations, fitted parameters, or self-citations are used to derive the central performance claims; the metrics are direct measurements against independent baselines and DFT. The MLFF-feedback assumption is an unverified premise for the method's reliability but does not create a definitional or fitted-input loop within the reported results. This matches the default expectation of a non-circular empirical systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The contribution is an integration framework built on existing MLFF and LLM components; no new physical constants, particles, or ad-hoc fitted parameters are introduced in the abstract.

axioms (1)

domain assumption: MLFF relaxations supply reliable enough structural and energetic signals for LLM agents to perform effective error correction.
This assumption underpins the closed-loop mechanism and is invoked when the abstract states that feedback enables autonomous error correction.

pith-pipeline@v0.9.1-grok · 5819 in / 1368 out tokens · 35013 ms · 2026-06-26T20:00:14.653079+00:00 · methodology


[1] [1]

J. K. Nørskov, F. Abild-Pedersen, F. Studt, and T. Bligaard, Density functional theory in surface chemistry and catalysis, Proc. Natl. Acad. Sci. USA108, 937 (2011)

2011

[2] [2]

Z. W. Ulissi, A. J. Medford, T. Bligaard, and J. K. Nørskov, To address surface reaction net- work complexity using scaling relations machine learning and dft calculations, Nat. Commun. 8, 14621 (2017)

2017

[3] [3]

Andersen and K

M. Andersen and K. Reuter, Adsorption enthalpies for catalysis modeling through machine- learned descriptors, Acc. Chem. Res.54, 2741 (2021)

2021

[4] [4]

Greeley, Theoretical heterogeneous catalysis: Scaling relationships and computational cat- alyst design, Annu

J. Greeley, Theoretical heterogeneous catalysis: Scaling relationships and computational cat- alyst design, Annu. Rev. Chem. Biomol. Eng.7, 605 (2016)

2016

[5] [5]

B. C. Yeo, H. Nam, H. Nam, M.-C. Kim, H. W. Lee, S.-C. Kim, S. O. Won, D. Kim, K.-Y. Lee, S. Y. Lee, and S. S. Han, High-throughput computational–experimental screening protocol for the discovery of bimetallic catalysts, npj Comput. Mater.7, 137 (2021)

2021

[6] [6]

A. S. Rosen, J. M. Notestein, and R. Q. Snurr, Identifying promising metal–organic frameworks for heterogeneous catalysis via high-throughput periodic density functional theory, J. Comput. Chem.40, 1305 (2019)

2019

[7] [7]

Deshpande, T

S. Deshpande, T. Maxson, and J. Greeley, Graph theory approach to determine configurations of multidentate and high coverage adsorbates for heterogeneous catalysis, npj Comput. Mater. 6, 79 (2020). 30

2020

[8] [8]

P. G. Ghanekar, S. Deshpande, and J. Greeley, Adsorbate chemical environment-based ma- chine learning framework for heterogeneous catalysis, Nat. Commun.13, 5788 (2022)

2022

[9] [9]

L. B. Vilhelmsen and B. Hammer, A genetic algorithm for first principles global structure optimization of supported nano structures, J. Chem. Phys.141, 044711 (2014)

2014

[10] [10]

D. J. Wales and J. P. K. Doye, Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms, J. Phys. Chem. A101, 5111 (1997)

1997

[11] [11]

C. J. Pickard and R. J. Needs,Ab initiorandom structure searching, J. Phys.: Condens. Matter23, 053201 (2011)

2011

[12] [12]

Stamatakis and D

M. Stamatakis and D. G. Vlachos, A graph-theoretical kinetic monte carlo framework for on-lattice chemical kinetics, J. Chem. Phys.134, 214115 (2011)

2011

[13] [13]

Batatia, D

I. Batatia, D. P. Kov´ acs, G. N. C. Simm, C. Ortner, and G. Cs´ anyi, MACE: Higher order equivariant message passing neural networks for fast and accurate force fields, Adv. Neural Inf. Process. Syst.35(2022)

2022

[14] [14]

Batatia, P

I. Batatia, P. Benner, Y. Chiang, A. M. Elena, D. P. Kov´ acs, J. Riebesell, X. R. Advincula, M. Asta, M. Avaylon, W. J. Baldwin, F. Berger, N. Bernstein, A. Bhowmik, F. Bigi, S. M. Blau, V. C˘ arare, M. Ceriotti, S. Chong, J. P. Darby, S. De, F. Della Pia, V. L. Deringer, R. Elijoˇ sius, Z. El-Machachi, E. Fako, F. Falcioni, A. C. Ferrari, J. L. A. Gardn...

2025

[15] [15]

B. Deng, P. Zhong, K. Jun, J. Riebesell, K. Han, C. J. Bartel, and G. Ceder, CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling, Nat. Mach. Intell.5, 1031 (2023). 31

2023

[16] [16]

Y.-L. Liao, B. Wood, A. Das, and T. Smidt, EquiformerV2: Improved equivariant trans- former for scaling to higher-degree representations, inInternational Conference on Learning Representations(2024)

2024

[17] J. Lan, A. Palizhati, M. Shuaibi, B. M. Wood, B. Wander, A. Das, M. Uyttendaele, C. L. Zitnick, and Z. W. Ulissi, AdsorbML: a leap in efficiency for adsorption energy calculations using generalizable machine learning potentials, npj Comput. Mater. 9, 172 (2023).

[18] L. Chanussot, A. Das, S. Goyal, T. Lavril, M. Shuaibi, M. Riviere, K. Tran, J. Heras-Domingo, C. Ho, W. Hu, et al., Open Catalyst 2020 (OC20) dataset and community challenges, ACS Catal. 11, 6059 (2021).

[19] R. Tran, J. Lan, M. Shuaibi, B. M. Wood, S. Goyal, A. Das, J. Heras-Domingo, A. Kolluru, A. Rizvi, N. Shoghi, et al., The Open Catalyst 2022 (OC22) dataset and challenges for oxide electrocatalysts, ACS Catal. 13, 3066 (2023).

[20] P. Reiser, M. Neubert, A. Eberhard, L. Torresi, C. Zhou, C. Shao, H. Metni, C. van Hoesel, H. Schopmans, T. Sommer, and P. Friederich, Graph neural networks for materials science and chemistry, Commun. Mater. 3, 93 (2022).

[21] J. Chen, X. Huang, C. Hua, Y. He, and P. Schwaller, A multi-modal transformer for predicting global minimum adsorption energy, Nat. Commun. 16, 3232 (2025).

[22] A. J. Chowdhury, W. Yang, E. Walker, O. Mamun, A. Heyden, and G. A. Terejanu, Prediction of adsorption energies for chemical species on metal catalyst surfaces using machine learning, J. Phys. Chem. C 122, 28142 (2018).

[23] T. Xie, X. Fu, O.-E. Ganea, R. Barzilay, and T. Jaakkola, Crystal diffusion variational autoencoder for periodic material generation, in International Conference on Learning Representations (2022).

[24] C. Zeni, R. Pinsler, D. Zügner, A. Fowler, M. Horton, X. Fu, Z. Wang, A. Shysheya, J. Crabbé, S. Ueda, R. Sordillo, L. Sun, J. Smith, B. Nguyen, H. Schulz, S. Lewis, C.-W. Huang, Z. Lu, Y. Zhou, H. Yang, H. Hao, J. Li, C. Yang, W. Li, R. Tomioka, and T. Xie, A generative model for inorganic materials design, Nature 639, 624 (2025).

[25] M. Abolhasani and E. Kumacheva, The rise of self-driving labs in chemical and materials sciences, Nat. Synth. 2, 483 (2023).

[26] J. Yang, X. Zhang, X. Zhang, B. Niu, F. Wu, N. Luo, J. He, C. Wang, B. Shan, and Q. Li, Stable adsorption configuration searching in hetero-catalysis based on similar distribution and active learning, J. Catal. 443, 115971 (2025).

[27] A. Merchant, S. Batzner, S. S. Schoenholz, M. Aykol, G. Cheon, and E. D. Cubuk, Scaling deep learning for materials discovery, Nature 624, 80 (2023).

[28] A. Mirza, N. Alampara, S. Kunchapu, M. Ríos-García, B. Emoekabu, A. Krishnan, T. Gupta, M. Schilling-Wilhelmi, M. Okereke, A. Aneesh, M. Asgari, J. Eberhardt, A. M. Elahi, H. M. Elbeheiry, M. V. Gil, C. Glaubitz, M. Greiner, C. T. Holick, T. Hoffmann, A. Ibrahim, L. C. Klepsch, Y. Köster, F. A. Kreth, J. Meyer, S. Miret, J. M. Peschel, M. Ringleb, N...

2025

[29] A. M. Bran, S. Cox, O. Schilter, C. Baldassari, A. D. White, and P. Schwaller, Augmenting large language models with chemistry tools, Nat. Mach. Intell. 6, 525 (2024).

[30] K. M. Jablonka, Q. Ai, A. Al-Feghali, S. Badhwar, J. D. Bocarsly, A. M. Bran, S. Bringuier, L. C. Brinson, K. Choudhary, D. Circi, S. Cox, W. A. de Jong, M. L. Evans, N. Gastellu, J. Genzling, M. V. Gil, A. K. Gupta, Z. Hong, A. Imran, S. Kruschwitz, A. Labarre, J. Lála, T. Liu, S. Ma, S. Majumdar, G. W. Merz, N. Moitessier, E. Moubarak, B. Mouriño, B...

2023

[31] M. C. Ramos, C. J. Collison, and A. D. White, A review of large language models and autonomous agents in chemistry, Chem. Sci. 16, 2514 (2025).

[32] X. Jiang, W. Wang, S. Tian, H. Wang, T. Lookman, and Y. Su, Applications of natural language processing and large language models in materials discovery, npj Comput. Mater. 11, 79 (2025).

[33] S. Miret and N. M. A. Krishnan, Enabling large language models for real-world materials discovery, Nat. Mach. Intell. 7, 991 (2025).

[34] N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, H. Kim, A. Jain, C. J. Bartel, K. Persson, Y. Zeng, and G. Ceder, An autonomous laboratory for the accelerated synthesis of inorganic materials, Nature 624, 86 (2023).

[35] D. A. Boiko, R. MacKnight, B. Kline, and G. Gomes, Autonomous chemical research with large language models, Nature 624, 570 (2023).

[36] Y. Zimmermann, A. Bazgir, A. Al-Feghali, M. Ansari, J. Bocarsly, L. C. Brinson, Y. Chiang, D. Circi, M.-H. Chiu, N. Daelman, M. L. Evans, A. S. Gangan, J. George, H. Harb, G. Khalighinejad, S. T. Khan, S. Klawohn, M. Lederbauer, S. Mahjoubi, B. Mohr, S. M. Moosavi, A. Naik, A. B. Ozhan, D. Plessers, A. Roy, F. Schöppach, P. Schwaller, C. Terboven, K...

2025

[37] F. Yang and J. D. Evans, QUASAR: A universal autonomous system for atomistic simulation and a benchmark of its capabilities, J. Chem. Inf. Model. 66, 5911 (2026).

[38] I. A. Stewart, T. P. Hage, Y.-C. Hsu, and M. J. Buehler, Graphagents: Knowledge graph-guided agentic AI for cross-domain materials design (2026), arXiv:2602.07491 [cond-mat.mtrl-sci].

[39] A. Chandrasekhar, J. Ock, and A. Barati Farimani, Catalyst-Agent: Autonomous heterogeneous catalyst screening and optimization with an LLM agent (2026), arXiv:2603.01311 [cond-mat.mtrl-sci].

[40] J. Wei, Y. Yang, X. Zhang, Y. Chen, X. Zhuang, Z. Gao, D. Zhou, G. Wang, Z. Gao, J. Cao, Z. Qiu, M. Hu, C. Ma, S. Tang, J. He, C. Song, X. He, Q. Zhang, C. You, S. Zheng, N. Ding, W. Ouyang, N. Dong, Y. Cheng, S. Sun, L. Bai, and B. Zhou, From AI for science to agentic science: A survey on autonomous scientific discovery (2025), arXiv:2508.14111 [cs.AI].

[41] J. Ock, R. S. Meda, T. Vinchurkar, Y. Jadhav, and A. Barati Farimani, Adsorb-agent: autonomous identification of stable adsorption configurations via a large language model agent, Digital Discovery 5, 617 (2026).

[42] B. Settles, Active Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning (Morgan & Claypool Publishers, 2012).

[43] J. Snoek, H. Larochelle, and R. P. Adams, Practical Bayesian optimization of machine learning algorithms, in Advances in Neural Information Processing Systems, Vol. 25 (2012).

[44] A. Roy, K. Shen, A. MacBride, A. Oladipupo, M. Taskeen, W. Treyde, R. A. E. A. Abakar, A. D. Abbas, E. Abdelfatah, A. A. Abdullahi, S. S. Abyah, C. R. Adjmi, F. Agbere, S. Aggarwal, M. Ahmed, T. Ahmed, M. Ajlouni, M. Akke, H. AlAdwan, A. S. Alazani, Z. A. Alharbi, W. A. Aljulyhi, M. A. AlKubaish, F. A. Almahri, S. A. Almohri, D. O. Alobo, M. Alouni, ...

Pith/arXiv arXiv 2025

[45] G. Kresse and J. Furthmüller, Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set, Phys. Rev. B 54, 11169 (1996).

[46] J. P. Perdew, K. Burke, and M. Ernzerhof, Generalized gradient approximation made simple, Phys. Rev. Lett. 77, 3865 (1996).

[47] J. Nocedal and S. J. Wright, Numerical optimization, 2nd ed. (Springer, 2006).

[48] G. DeepMind, Gemini 2.5 pro model card, https://ai.google.dev/gemini-api/docs/models/gemini (2026), accessed 2026-05.

[49] OpenAI, Gpt-5.4 model card, https://platform.openai.com/docs/models (2026), accessed 2026-05.

[50] Anthropic, Claude sonnet 4.6 model card, https://docs.anthropic.com/en/docs/about-claude/models (2026), accessed 2026-05.

[51] xAI, Grok 4, https://x.ai/blog/grok-4 (2025), accessed 2026-05.

[52] E. Fako and S. De, Simple heuristics for advanced sampling of reactive species on surfaces, ACS Catal. 16, 3149 (2026).

[53] G. Kresse and D. Joubert, From ultrasoft pseudopotentials to the projector augmented-wave method, Phys. Rev. B 59, 1758 (1999).

[54] V. Wang, N. Xu, J.-C. Liu, G. Tang, and W.-T. Geng, VASPKIT: A user-friendly interface facilitating high-throughput computing and analysis using VASP code, Comput. Phys. Commun. 267, 108033 (2021).

[55] M. Methfessel and A. T. Paxton, High-precision sampling for Brillouin-zone integration in metals, Phys. Rev. B 40, 3616 (1989).

[56] H. J. Monkhorst and J. D. Pack, Special points for Brillouin-zone integrations, Phys. Rev. B 13, 5188 (1976).

[57] J. Ock, Adsorb-agent open-source implementation, https://github.com/hoon-ock/CatalystAIgent (2025), accessed 2026-04; archived at https://doi.org/10.5281/zenodo.17585022.

[58] J. Zhou, X. Chen, M. Guo, W. Hu, B. Huang, and D. Yuan, Enhanced catalytic activity of bimetallic ordered catalysts for nitrogen reduction reaction by perturbation of scaling relations, ACS Catal. 13, 2190 (2023).

[59] N. Cheng, S. Stambula, D. Wang, M. N. Banis, J. Liu, A. Riese, B. Xiao, R. Li, T.-K. Sham, L.-M. Liu, G. A. Botton, and X. Sun, Platinum single-atom and cluster catalysis of the hydrogen evolution reaction, Nat. Commun. 7, 13638 (2016).

[60] L. Casillas-Trujillo, A. S. Parackal, R. Armiento, and B. Alling, Evaluating and improving the predictive accuracy of mixing enthalpies and volumes in disordered alloys from universal pretrained machine learning potentials, Phys. Rev. Mater. 8, 113803 (2024).