Breaking the Training Barrier of Billion-Parameter Universal Machine Learning Interatomic Potentials
Pith reviewed 2026-05-10 07:55 UTC · model grok-4.3
The pith
The Janus framework scales training of billion-parameter interatomic potential models to exascale performance, reducing time from weeks to hours.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MatRIS-MoE is presented as a billion-parameter Mixture-of-Experts model with invariant architecture for universal machine learning interatomic potentials. Janus is the high-dimensional distributed training framework equipped with hardware-aware optimizations that parallelize second-order derivative computations and communication. Deployed across two exascale supercomputers, the code attains 1.2/1.0 EFLOPS (24/35.5 percent of theoretical peak) in single precision at over 90 percent parallel efficiency, thereby shortening the training of billion-parameter uMLIPs from weeks to hours and supplying infrastructure for exascale AI-for-Science foundation models.
What carries the argument
Janus, the high-dimensional distributed training framework that applies hardware-aware optimizations to parallelize second-order derivative computations and communication for billion-parameter uMLIP training.
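Why second-order derivatives enter at all can be seen in a toy force-matching loss. A minimal sketch, using an invented one-parameter potential (not the paper's model): training on forces F = -dE/dx makes the parameter gradient depend on the mixed second derivative d²E/(dk dx), the quantity Janus must parallelize at scale.

```python
# Hedged toy sketch: why force-matching training needs second-order
# derivatives. E(x; k) = 0.5*k*x**2 is an invented one-parameter potential,
# not the paper's model; derivatives use central finite differences.

def energy(x, k):
    return 0.5 * k * x ** 2

def force(x, k, h=1e-6):
    # F = -dE/dx (first derivative in the coordinate).
    return -(energy(x + h, k) - energy(x - h, k)) / (2 * h)

def dforce_dk(x, k, h=1e-4):
    # dF/dk = -d2E/(dk dx): the mixed second derivative that appears in
    # the gradient of any loss defined on forces.
    return (force(x, k + h) - force(x, k - h)) / (2 * h)

x, k, f_ref = 1.5, 2.0, -2.5
f_pred = force(x, k)                                # analytically -k*x
grad_k = 2 * (f_pred - f_ref) * dforce_dk(x, k)     # chain rule on (F - F_ref)**2
print(round(f_pred, 3), round(dforce_dk(x, k), 3), round(grad_k, 3))  # → -3.0 -1.5 1.5
```

Autograd frameworks obtain the same quantity by differentiating through the force computation (a second backward pass), which is what makes the computation and communication pattern heavier than in ordinary first-order training.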
If this is right
- Training of billion-parameter universal interatomic potentials becomes feasible in hours on existing exascale hardware rather than weeks.
- Faster iteration is now possible when developing foundational models for quantum-accurate simulations of materials and molecules.
- The achieved performance sets a concrete benchmark for large-scale training of scientific machine learning models.
- Essential infrastructure is supplied for expanding the scale of universal interatomic potential training in AI-for-Science applications.
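For concreteness, the two headline metrics reduce to simple ratios. A minimal sketch: the 1.2 EFLOPS and 24% figures come from the abstract, while the per-step timings are invented for illustration.

```python
# Hedged sketch of the two headline metrics. The achieved/peak figures are
# from the abstract; t_ref and t_scaled below are hypothetical timings.

def fraction_of_peak(achieved_eflops, peak_eflops):
    return achieved_eflops / peak_eflops

def weak_scaling_efficiency(t_ref, t_scaled):
    # Weak scaling: same work per node, so the ideal runtime stays flat.
    return t_ref / t_scaled

# Implied theoretical peak from "1.2 EFLOPS at 24% of peak":
peak = 1.2 / 0.24                                    # 5.0 EFLOPS
print(round(fraction_of_peak(1.2, peak) * 100, 1))   # → 24.0

# Hypothetical timings: 100 s/step on the baseline, 108 s at full scale.
print(round(weak_scaling_efficiency(100.0, 108.0) * 100, 1))  # → 92.6
```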
Where Pith is reading between the lines
- The optimizations for handling derivative computations at scale could be adapted to other scientific machine learning models that rely on similar higher-order calculations.
- Shorter training cycles may enable researchers to test larger model sizes or broader training datasets in materials and molecular science.
- The demonstrated efficiency suggests the framework could support more frequent retraining or domain-specific fine-tuning of potentials for targeted applications.
- Hardware-specific software approaches like this may become necessary to extract full value from future exascale and post-exascale machines in computational science.
Load-bearing premise
The hardware-aware optimizations in Janus successfully parallelize second-order derivative computations and communication without introducing numerical instability or accuracy loss in the resulting potentials.
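The premise is testable in miniature: a partitioned ("parallel") gradient should match the serial one up to floating-point roundoff. A toy check with an invented pairwise energy, standing in for the kind of equivalence test the claim assumes:

```python
# Hedged sketch: serial vs. chunked-and-reduced gradients of a toy pairwise
# energy E = sum over pairs of 1/|xi - xj| on 1D "atoms". All names are
# invented; the chunking stands in for distributing work across ranks.
import itertools

def pair_grad(xs):
    # Serial gradient: d(1/|r|)/dxi = -sign(r)/r**2 with r = xi - xj.
    g = [0.0] * len(xs)
    for i, j in itertools.combinations(range(len(xs)), 2):
        r = xs[i] - xs[j]
        d = -1.0 / (r * abs(r))
        g[i] += d
        g[j] -= d
    return g

def pair_grad_chunked(xs, n_workers=3):
    # "Parallel" version: partition the pair list round-robin, accumulate
    # partial gradients per worker, then sum them (a stand-in for all-reduce).
    pairs = list(itertools.combinations(range(len(xs)), 2))
    partials = [[0.0] * len(xs) for _ in range(n_workers)]
    for k, (i, j) in enumerate(pairs):
        r = xs[i] - xs[j]
        d = -1.0 / (r * abs(r))
        partials[k % n_workers][i] += d
        partials[k % n_workers][j] -= d
    return [sum(col) for col in zip(*partials)]

xs = [0.0, 1.0, 2.5, 4.0]
serial = pair_grad(xs)
parallel = pair_grad_chunked(xs)
max_dev = max(abs(a - b) for a, b in zip(serial, parallel))
print(max_dev < 1e-12)  # → True (agreement up to summation-order roundoff)
```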
What would settle it
A side-by-side comparison on standard energy and force benchmarks between billion-parameter models trained with Janus and smaller models trained by conventional methods: comparable prediction errors would support the claim, while substantially higher errors, or a scaling run that cannot sustain the reported EFLOPS and parallel efficiency without degrading accuracy, would undermine it.
Original abstract
Universal Machine Learning Interatomic Potentials (uMLIPs), pre-trained on massively diverse datasets encompassing inorganic materials and organic molecules across the entire periodic table, serve as foundational models for quantum-accurate physical simulations. However, uMLIP training requires second-order derivatives, which lack corresponding parallel training frameworks; moreover, scaling to the billion-parameter regime causes explosive growth in computation and communication overhead, making its training a tremendous challenge. We introduce MatRIS-MoE, a billion-parameter Mixture-of-Experts model built upon invariant architecture, and Janus, a pioneering high-dimensional distributed training framework for uMLIPs with hardware-aware optimizations. Deployed across two Exascale supercomputers, our code attains a peak performance of 1.2/1.0 EFLOPS (24%/35.5% of theoretical peak) in single precision at over 90% parallel efficiency, compressing the training of billion-parameter uMLIPs from weeks to hours. This work establishes a new high-water mark for AI-for-Science (AI4S) foundation models at Exascale and provides essential infrastructure for rapid scientific discovery.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MatRIS-MoE, a billion-parameter Mixture-of-Experts model for universal machine learning interatomic potentials (uMLIPs) based on an invariant architecture, together with the Janus high-dimensional distributed training framework that incorporates hardware-aware optimizations for second-order derivatives and communication. It reports achieving 1.2/1.0 EFLOPS (24%/35.5% of theoretical peak) in single precision on two exascale supercomputers at >90% parallel efficiency, reducing training of billion-parameter uMLIPs from weeks to hours.
Significance. If the reported scaling and efficiency hold and the optimizations preserve model fidelity, the work would establish a new benchmark for exascale training of scientific foundation models, directly enabling faster development of quantum-accurate potentials for materials and molecular simulations.
major comments (1)
- Abstract: the headline performance numbers (1.2/1.0 EFLOPS, >90% efficiency, weeks-to-hours reduction) are presented without any accompanying validation that the Janus parallelization of second-order derivatives and all-reduce communication preserves numerical stability or uMLIP accuracy; no MAE/RMSE values on energies, forces or stresses versus DFT references, no ablation of optimized versus baseline runs, and no single-precision Hessian stability analysis are supplied, leaving the central claim that the framework is useful for quantum-accurate uMLIPs unverified.
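What the requested validation would look like in miniature (all numbers below are invented placeholders, not values from the paper):

```python
# Hedged sketch of the accuracy metrics the referee asks for: MAE and RMSE
# of predicted vs. reference forces. The arrays are made-up placeholders.
import math

def mae(pred, ref):
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)

def rmse(pred, ref):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))

f_ref  = [0.10, -0.25, 0.40, -0.05]   # hypothetical DFT reference forces (eV/Å)
f_pred = [0.12, -0.22, 0.37, -0.06]   # hypothetical model predictions

print(round(mae(f_pred, f_ref), 4), round(rmse(f_pred, f_ref), 4))  # → 0.0225 0.024
```

Reporting these numbers for Janus-trained models against a baseline implementation would directly address the comment.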
Simulated Author's Rebuttal
We thank the referee for the constructive review and for identifying the need for explicit validation of the optimizations' effects on model accuracy and stability. We address this major comment below and commit to revisions that will incorporate the requested evidence.
Point-by-point responses
-
Referee: Abstract: the headline performance numbers (1.2/1.0 EFLOPS, >90% efficiency, weeks-to-hours reduction) are presented without any accompanying validation that the Janus parallelization of second-order derivatives and all-reduce communication preserves numerical stability or uMLIP accuracy; no MAE/RMSE values on energies, forces or stresses versus DFT references, no ablation of optimized versus baseline runs, and no single-precision Hessian stability analysis are supplied, leaving the central claim that the framework is useful for quantum-accurate uMLIPs unverified.
Authors: We agree that the abstract and main results emphasize scaling and efficiency without directly demonstrating that the Janus optimizations preserve uMLIP accuracy and numerical properties. The MatRIS-MoE model builds on an invariant architecture whose baseline accuracy has been established in related literature, and our parallelization is designed to be mathematically equivalent. However, the manuscript does not currently contain the requested ablations, MAE/RMSE comparisons, or single-precision Hessian analysis. In the revised version we will add: (i) MAE/RMSE values on energies, forces, and stresses versus DFT references for models trained with Janus versus a baseline implementation, (ii) an ablation study quantifying any differences in final model performance due to the second-order derivative and communication optimizations, and (iii) a stability analysis of Hessian computations in single precision. These additions will be placed in the results section and briefly referenced in a revised abstract to substantiate the claim that the framework supports quantum-accurate uMLIPs.
Revision: yes
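The promised single-precision stability analysis can be illustrated on a toy potential. The sketch below (invented double-well energy, not the paper's model) shows how float32 rounding inflates the error of a finite-difference second derivative relative to float64:

```python
# Hedged sketch: the same finite-difference second derivative evaluated in
# double precision and in emulated float32, on an invented double-well
# potential. Illustrates why single-precision Hessian stability needs checking.
import struct

def f32(x):
    # Round a Python double to the nearest IEEE-754 single.
    return struct.unpack("f", struct.pack("f", x))[0]

def energy(x):
    return x ** 4 - 2.0 * x ** 2       # toy double-well potential

def hess_fd(x, h, cast=lambda v: v):
    # Central second difference; `cast` optionally forces float32 rounding.
    e_p, e_0, e_m = cast(energy(x + h)), cast(energy(x)), cast(energy(x - h))
    return cast((e_p - 2.0 * e_0 + e_m) / (h * h))

x, h = 1.2, 1e-3
exact = 12.0 * x ** 2 - 4.0            # analytic d2E/dx2
err64 = abs(hess_fd(x, h) - exact)
err32 = abs(hess_fd(x, h, cast=f32) - exact)
print(err64, err32)                    # float32 roundoff dominates err32
```

The cancellation in the second difference amplifies float32 rounding by a factor of 1/h², which is the effect a single-precision training pipeline must control.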
Circularity Check
No circularity: performance claims rest on measured runtime and efficiency, not on self-referential derivations.
full rationale
This is an engineering systems paper introducing MatRIS-MoE architecture and the Janus distributed training framework. Central claims are empirical: achieved 1.2/1.0 EFLOPS at >90% parallel efficiency on exascale hardware, with training time reduced from weeks to hours. These rest on wall-clock measurements and FLOPS counters from actual deployments, not on any equation or parameter that is fitted to a subset and then renamed as a 'prediction.' No self-definitional loops, no uniqueness theorems imported from prior self-citations, and no ansatz smuggled via citation. The paper is self-contained against external benchmarks of runtime and scaling; the skeptic concern about missing accuracy numbers is a completeness issue, not circularity.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Invariant architectures preserve physical symmetries under rotations and translations.
- Domain assumption: Hardware-aware optimizations can hide communication latency for second-order derivative tensors at exascale.
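The first axiom is directly checkable for any distance-based energy. A toy 2D sketch with an invented Lennard-Jones-like form:

```python
# Hedged sketch: an energy built only from interatomic distances is unchanged
# by rigid rotations and translations. Toy 2D system; all numbers invented.
import math

def energy(points):
    # Pairwise Lennard-Jones-like sum over distances (toy form).
    e = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            r = math.hypot(dx, dy)
            e += 1.0 / r ** 12 - 2.0 / r ** 6
    return e

def transform(points, theta, tx, ty):
    # Rigid motion: rotate by theta, then translate by (tx, ty).
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

pts = [(0.0, 0.0), (1.1, 0.0), (0.4, 0.9)]
e0 = energy(pts)
e1 = energy(transform(pts, theta=0.7, tx=3.0, ty=-1.5))
print(abs(e0 - e1) < 1e-9)  # → True (invariance up to roundoff)
```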
invented entities (2)
- MatRIS-MoE: no independent evidence
- Janus: no independent evidence