pith. machine review for the scientific record.

arxiv: 2604.15821 · v1 · submitted 2026-04-17 · 💻 cs.DC · cs.LG

Recognition: unknown

Breaking the Training Barrier of Billion-Parameter Universal Machine Learning Interatomic Potentials

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 07:55 UTC · model grok-4.3

classification 💻 cs.DC cs.LG
keywords billion-parameter models · mixture of experts · interatomic potentials · distributed training · exascale computing · second-order derivatives · high-performance computing · AI for science

The pith

The Janus framework scales training of billion-parameter interatomic potential models to exascale performance, reducing training time from weeks to hours.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MatRIS-MoE, a billion-parameter Mixture-of-Experts model built on an invariant architecture for universal machine learning interatomic potentials that cover materials and molecules across the periodic table. It pairs the model with the Janus distributed training framework, which applies hardware-aware optimizations to handle second-order derivative computations and the resulting communication overhead. When deployed on two exascale supercomputers, the system reaches 1.2 and 1.0 EFLOPS in single precision at over 90 percent parallel efficiency, compressing training from weeks to hours. A sympathetic reader would care because the work makes large foundational models for quantum-accurate physical simulations practical to develop and use.
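To make the second-order requirement concrete: a force-matching loss differentiates the predicted energy with respect to atomic positions, and training then differentiates that derivative with respect to parameters. The sketch below is a minimal stand-in, a toy MLP on random data rather than MatRIS-MoE or Janus, and only shows where the mixed second-order derivative enters.

```python
# Minimal sketch of why uMLIP training needs second-order derivatives.
# The model is a hypothetical stand-in MLP, not MatRIS-MoE.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 1))

pos = torch.randn(8, 3, requires_grad=True)  # toy atomic coordinates
e_ref = torch.randn(())                      # toy reference energy
f_ref = torch.randn(8, 3)                    # toy reference forces

energy = model(pos).sum()                    # predicted total energy

# Forces are -dE/dx; create_graph=True keeps this derivative inside the
# autograd graph so the force loss can itself be differentiated.
forces = -torch.autograd.grad(energy, pos, create_graph=True)[0]

loss = (energy - e_ref).pow(2) + (forces - f_ref).pow(2).mean()

# Backpropagating the force term needs d2E/(dtheta dx), the mixed
# second-order derivative that Janus must parallelize at scale.
loss.backward()
print(model[0].weight.grad.norm())
```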

Core claim

MatRIS-MoE is presented as a billion-parameter Mixture-of-Experts model with invariant architecture for universal machine learning interatomic potentials. Janus is the high-dimensional distributed training framework equipped with hardware-aware optimizations that parallelize second-order derivative computations and communication. Deployed across two exascale supercomputers, the code attains 1.2/1.0 EFLOPS (24/35.5 percent of theoretical peak) in single precision at over 90 percent parallel efficiency, thereby shortening the training of billion-parameter uMLIPs from weeks to hours and supplying infrastructure for exascale AI-for-Science foundation models.
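For readers new to the term: a Mixture-of-Experts layer routes each input to a small subset of parallel sub-networks, so parameter count can reach the billion scale while per-input compute grows much more slowly. Below is a generic top-2 routing sketch; the gating, expert count, and invariant features of MatRIS-MoE are not specified in this summary, so every detail here is illustrative.

```python
# Generic top-2 Mixture-of-Experts layer; illustrative only, not the
# MatRIS-MoE design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                     # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # per-token mixing weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slots = (idx == e).nonzero(as_tuple=True)
            if rows.numel():                  # only tokens routed here
                out[rows] += weights[rows, slots].unsqueeze(-1) * expert(x[rows])
        return out

print(TinyMoE()(torch.randn(16, 64)).shape)   # torch.Size([16, 64])
```

The scaling appeal is that each token activates only k of the n experts, which is what makes billion-parameter totals tractable, at the price of the routing and expert-placement communication that Janus reportedly optimizes.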

What carries the argument

Janus, the high-dimensional distributed training framework that applies hardware-aware optimizations to parallelize second-order derivative computations and communication for billion-parameter uMLIP training.

If this is right

  • Training of billion-parameter universal interatomic potentials becomes feasible in hours on existing exascale hardware rather than weeks.
  • Faster iteration becomes possible when developing foundational models for quantum-accurate simulations of materials and molecules.
  • The achieved performance sets a concrete benchmark for large-scale training of scientific machine learning models (the arithmetic behind such headline numbers is sketched after this list).
  • Essential infrastructure is supplied for expanding the scale of universal interatomic potential training in AI-for-Science applications.
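For orientation, the headline figures decompose with standard bookkeeping: peak fraction is sustained FLOP/s over theoretical peak, and strong-scaling parallel efficiency is measured speedup over ideal speedup. In the sketch below only the 1.2 EFLOPS and 24 percent figures come from the abstract; the node counts and timings are hypothetical.

```python
# Standard bookkeeping behind headline numbers such as
# "1.2 EFLOPS at 24% of peak, >90% parallel efficiency".

achieved_flops = 1.2e18                 # reported sustained rate (FLOP/s)
peak_flops = achieved_flops / 0.24      # implied theoretical peak
print(f"implied peak: {peak_flops:.2e} FLOP/s")

# Strong scaling: fixed total problem size, growing node count.
base_nodes, base_time = 512, 100.0      # hypothetical reference run (s)
big_nodes, big_time = 4096, 13.5        # hypothetical large run (s)

speedup = base_time / big_time          # 7.41x
ideal = big_nodes / base_nodes          # 8x
print(f"parallel efficiency: {speedup / ideal:.1%}")  # 92.6%
```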

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The optimizations for handling derivative computations at scale could be adapted to other scientific machine learning models that rely on similar higher-order calculations.
  • Shorter training cycles may enable researchers to test larger model sizes or broader training datasets in materials and molecular science.
  • The demonstrated efficiency suggests the framework could support more frequent retraining or domain-specific fine-tuning of potentials for targeted applications.
  • Hardware-specific software approaches like this may become necessary to extract full value from future exascale and post-exascale machines in computational science.

Load-bearing premise

The hardware-aware optimizations in Janus successfully parallelize second-order derivative computations and communication without introducing numerical instability or accuracy loss in the resulting potentials.
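One cheap probe of this premise for any implementation is an equivalence test: the gradient of a force-matching loss computed in shards and summed, the serial analogue of a data-parallel all-reduce, should match the monolithic gradient to floating-point tolerance. The double-precision toy below is a sanity check one could run, not the paper's validation protocol.

```python
# Sanity check: sharded force-loss gradients, summed, should equal the
# monolithic gradient. Toy per-atom model, not Janus.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1)).double()
pos_data = torch.randn(8, 3, dtype=torch.float64)
f_ref = torch.randn(8, 3, dtype=torch.float64)

def grads_for(chunk, ref):
    pos = chunk.clone().requires_grad_(True)
    energy = model(pos).sum()
    forces = -torch.autograd.grad(energy, pos, create_graph=True)[0]
    loss = (forces - ref).pow(2).sum()
    return torch.autograd.grad(loss, list(model.parameters()))

whole = grads_for(pos_data, f_ref)                      # monolithic
halves = [grads_for(pos_data[i:i + 4], f_ref[i:i + 4])  # two shards
          for i in (0, 4)]
summed = [a + b for a, b in zip(*halves)]               # "all-reduce"

print(all(torch.allclose(w, s, atol=1e-9)
          for w, s in zip(whole, summed)))              # True
```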

What would settle it

A side-by-side validation on standard energy and force datasets where the billion-parameter models trained with Janus exhibit substantially higher prediction errors than smaller models trained by conventional methods, or a scaling run that fails to sustain the reported EFLOPS and efficiency without accuracy degradation.

Figures

Figures reproduced from arXiv: 2604.15821 by Chen Wang, Guangming Tan, Hongyu Wang, Jingde Bu, Long Wang, Mingzhen Li, Siyu Hu, Weijian Liu, Weile Jia, Xiangyu Zhang, Yan Wang, Yiming Du, Yuanchang Zhou, Yutong Lu, Zhuoqiang Guo.

Figure 1. Overview of the MatRIS architecture. (a) Graph construction under pe…
Figure 2. Overview of our work. (a) Model architecture. (b)–(d) Framework-level optimizations, including FSDP, FSEP, and FSGP. (e)–(g) Supercomputer-level…
Figure 3. Execution timeline of our framework in MatRIS-MoE. Each interac…
Figure 4. Convergence behavior of MatRIS-MoE under different batch sizes on…
Figure 5. The out-of-the-box accuracy results of MatRIS-MoE on cross-domain…
Figure 6. Strong scaling of MatRIS-MoE training on CNIS and LineShine.
Figure 8. Applicability of MatRIS-MoE across (a) energy ranking, (b) molecular…
original abstract

Universal Machine Learning Interatomic Potentials (uMLIPs), pre-trained on massively diverse datasets encompassing inorganic materials and organic molecules across the entire periodic table, serve as foundational models for quantum-accurate physical simulations. However, uMLIP training requires second-order derivatives, which lack corresponding parallel training frameworks; moreover, scaling to the billion-parameter regime causes explosive growth in computation and communication overhead, making its training a tremendous challenge. We introduce MatRIS-MoE, a billion-parameter Mixture-of-Experts model built upon invariant architecture, and Janus, a pioneering high-dimensional distributed training framework for uMLIPs with hardware-aware optimizations. Deployed across two Exascale supercomputers, our code attains a peak performance of 1.2/1.0 EFLOPS (24%/35.5% of theoretical peak) in single precision at over 90% parallel efficiency, compressing the training of billion-parameter uMLIPs from weeks to hours. This work establishes a new high-water mark for AI-for-Science (AI4S) foundation models at Exascale and provides essential infrastructure for rapid scientific discovery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces MatRIS-MoE, a billion-parameter Mixture-of-Experts model for universal machine learning interatomic potentials (uMLIPs) based on an invariant architecture, together with the Janus high-dimensional distributed training framework that incorporates hardware-aware optimizations for second-order derivatives and communication. It reports achieving 1.2/1.0 EFLOPS (24%/35.5% of theoretical peak) in single precision on two exascale supercomputers at >90% parallel efficiency, reducing training of billion-parameter uMLIPs from weeks to hours.

Significance. If the reported scaling and efficiency hold and the optimizations preserve model fidelity, the work would establish a new benchmark for exascale training of scientific foundation models, directly enabling faster development of quantum-accurate potentials for materials and molecular simulations.

major comments (1)
  1. Abstract: the headline performance numbers (1.2/1.0 EFLOPS, >90% efficiency, weeks-to-hours reduction) are presented without any accompanying validation that the Janus parallelization of second-order derivatives and all-reduce communication preserves numerical stability or uMLIP accuracy; no MAE/RMSE values on energies, forces or stresses versus DFT references, no ablation of optimized versus baseline runs, and no single-precision Hessian stability analysis are supplied, leaving the central claim that the framework is useful for quantum-accurate uMLIPs unverified.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and for identifying the need for explicit validation of the optimizations' effects on model accuracy and stability. We address this major comment below and commit to revisions that will incorporate the requested evidence.

point-by-point responses
  1. Referee: Abstract: the headline performance numbers (1.2/1.0 EFLOPS, >90% efficiency, weeks-to-hours reduction) are presented without any accompanying validation that the Janus parallelization of second-order derivatives and all-reduce communication preserves numerical stability or uMLIP accuracy; no MAE/RMSE values on energies, forces or stresses versus DFT references, no ablation of optimized versus baseline runs, and no single-precision Hessian stability analysis are supplied, leaving the central claim that the framework is useful for quantum-accurate uMLIPs unverified.

    Authors: We agree that the abstract and main results emphasize scaling and efficiency without directly demonstrating that the Janus optimizations preserve uMLIP accuracy and numerical properties. The MatRIS-MoE model builds on an invariant architecture whose baseline accuracy has been established in related literature, and our parallelization is designed to be mathematically equivalent. However, the manuscript does not currently contain the requested ablations, MAE/RMSE comparisons, or single-precision Hessian analysis. In the revised version we will add: (i) MAE/RMSE values on energies, forces, and stresses versus DFT references for models trained with Janus versus a baseline implementation, (ii) an ablation study quantifying any differences in final model performance due to the second-order derivative and communication optimizations, and (iii) a stability analysis of Hessian computations in single precision. These additions will be placed in the results section and briefly referenced in a revised abstract to substantiate the claim that the framework supports quantum-accurate uMLIPs. revision: yes
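The metrics the referee requests reduce to standard regression errors against DFT references. A minimal sketch of the computation, with placeholder arrays standing in for real predictions and labels:

```python
# MAE/RMSE on energies and forces versus DFT references; the arrays
# here are random placeholders, not results from the paper.
import numpy as np

def mae(pred, ref):
    return np.abs(pred - ref).mean()

def rmse(pred, ref):
    return np.sqrt(((pred - ref) ** 2).mean())

rng = np.random.default_rng(0)
e_pred, e_ref = rng.normal(size=100), rng.normal(size=100)  # eV/atom
f_pred = rng.normal(size=(100, 48, 3))                      # eV/Å
f_ref = rng.normal(size=(100, 48, 3))

print(f"energy MAE:  {mae(e_pred, e_ref):.4f} eV/atom")
print(f"energy RMSE: {rmse(e_pred, e_ref):.4f} eV/atom")
print(f"force MAE:   {mae(f_pred, f_ref):.4f} eV/Å")
```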

Circularity Check

0 steps flagged

No circularity: performance claims rest on measured runtime and efficiency, not self-referential derivations

full rationale

This is an engineering systems paper introducing MatRIS-MoE architecture and the Janus distributed training framework. Central claims are empirical: achieved 1.2/1.0 EFLOPS at >90% parallel efficiency on exascale hardware, with training time reduced from weeks to hours. These rest on wall-clock measurements and FLOPS counters from actual deployments, not on any equation or parameter that is fitted to a subset and then renamed as a 'prediction.' No self-definitional loops, no uniqueness theorems imported from prior self-citations, and no ansatz smuggled via citation. The paper is self-contained against external benchmarks of runtime and scaling; the skeptic concern about missing accuracy numbers is a completeness issue, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The work rests on standard assumptions from distributed deep learning and equivariant neural networks for atomic systems; no new physical axioms or fitted constants are introduced in the abstract.

axioms (2)
  • domain assumption Invariant architectures preserve physical symmetries under rotations and translations
    Invoked to justify the base model choice for uMLIPs
  • domain assumption Hardware-aware optimizations can hide communication latency for second-order derivative tensors at exascale
    Central to the claimed efficiency of Janus
invented entities (2)
  • MatRIS-MoE no independent evidence
    purpose: Billion-parameter mixture-of-experts model for universal interatomic potentials
    New model architecture presented as the core contribution
  • Janus no independent evidence
    purpose: High-dimensional distributed training framework for uMLIPs
    New software system for exascale training

pith-pipeline@v0.9.0 · 5546 in / 1442 out tokens · 19918 ms · 2026-05-10T07:55:26.075367+00:00 · methodology

discussion (0)

