Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
Pith reviewed 2026-05-16 23:39 UTC · model grok-4.3
The pith
Open Materials 2024 supplies 110 million DFT calculations and EquiformerV2 models that reach F1 scores above 0.9 for ground-state stability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The OMat24 dataset contains more than 110 million DFT calculations focused on inorganic structural and compositional diversity, and the accompanying EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard while predicting ground-state stability with an F1 score above 0.9 and formation energies with an accuracy of 20 meV/atom.
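The two headline metrics can be made concrete with a minimal sketch. It assumes a structure counts as stable when its energy above the convex hull is at or below 0 eV/atom; that threshold, the function names, and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
from typing import Sequence


def stability_f1(e_hull_true: Sequence[float],
                 e_hull_pred: Sequence[float],
                 threshold: float = 0.0) -> float:
    """F1 score for the 'stable' class.

    Assumption: a structure is stable when its energy above the hull
    (eV/atom) is <= threshold.
    """
    tp = fp = fn = 0
    for t, p in zip(e_hull_true, e_hull_pred):
        true_stable = t <= threshold
        pred_stable = p <= threshold
        if true_stable and pred_stable:
            tp += 1
        elif pred_stable:
            fp += 1
        elif true_stable:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mae_mev_per_atom(e_true: Sequence[float], e_pred: Sequence[float]) -> float:
    """Mean absolute error in meV/atom for energies given in eV/atom."""
    return 1000.0 * sum(abs(t - p) for t, p in zip(e_true, e_pred)) / len(e_true)


# Toy reference and model energies (eV/atom); hypothetical numbers.
dft_e_hull = [-0.02, 0.00, 0.05, 0.10, -0.01]
model_e_hull = [-0.01, 0.02, 0.04, -0.03, -0.02]

f1 = stability_f1(dft_e_hull, model_e_hull)
mae = mae_mev_per_atom(dft_e_hull, model_e_hull)
```

The same `mae_mev_per_atom` helper applies unchanged to formation energies, which is the quantity the 20 meV/atom figure refers to.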
What carries the argument
EquiformerV2 graph neural network models trained on the large-scale OMat24 DFT dataset, with auxiliary denoising objectives and fine-tuning across multiple materials datasets.
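The summary above names auxiliary denoising objectives without specifying their form. One common pattern is sketched here under stated assumptions: atomic positions are perturbed with Gaussian noise, the model is asked to predict that noise, and the resulting error is added to the energy and force losses with a small weight. The noise scale, the weight `w_denoise`, and all function names are hypothetical and are not taken from the paper.

```python
import random


def add_position_noise(positions, sigma, rng):
    """Return (noisy_positions, noise) using i.i.d. Gaussian noise per coordinate."""
    noise = [[rng.gauss(0.0, sigma) for _ in atom] for atom in positions]
    noisy = [[x + n for x, n in zip(atom, dn)]
             for atom, dn in zip(positions, noise)]
    return noisy, noise


def denoising_loss(predicted_noise, noise):
    """Mean squared error between predicted and applied noise over all coordinates."""
    sq = [(p - n) ** 2
          for pa, na in zip(predicted_noise, noise)
          for p, n in zip(pa, na)]
    return sum(sq) / len(sq)


def combined_loss(energy_loss, force_loss, denoise_loss, w_denoise=0.1):
    """Weighted sum of the main losses and the auxiliary term (weight is an assumption)."""
    return energy_loss + force_loss + w_denoise * denoise_loss


rng = random.Random(0)
positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]]  # angstrom, toy
noisy, noise = add_position_noise(positions, sigma=0.1, rng=rng)

# A perfect denoiser recovers the noise exactly; an all-zero guess does not.
perfect = denoising_loss(noise, noise)
zero_guess = denoising_loss([[0.0] * 3 for _ in positions], noise)
total = combined_loss(1.0, 2.0, 0.5)
```

In a real training loop the `predicted_noise` argument would come from a model head evaluated on `noisy`; that model is deliberately left out of the sketch.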
If this is right
- High-accuracy stability predictions can be used to screen millions of candidate structures before any DFT or experiment.
- Larger models and auxiliary denoising tasks improve accuracy across OMat24, MPtraj, and Alexandria, indicating scalable training strategies.
- Open data and models allow fine-tuning on domain-specific datasets to reach usable performance on formation energies and stability.
- Community access removes the prior barrier of proprietary training data, enabling faster iteration on AI-assisted materials design.
Where Pith is reading between the lines
- If the accuracy holds on experimental benchmarks, the approach could shorten the cycle from composition idea to stable candidate by orders of magnitude for climate-relevant materials.
- The same training recipe may extend to other properties such as electronic band gaps or mechanical moduli once additional labels are added to the dataset.
- Combining these models with active-learning loops could further reduce the number of expensive DFT calculations needed for new discoveries.
Load-bearing premise
Density functional theory calculations supply accurate enough representations of real material ground-state stabilities and formation energies, and the models generalize reliably to materials outside the training set.
What would settle it
Direct experimental synthesis and stability measurement of several high-confidence predictions for previously unseen compositions, or comparison of model energies against higher-accuracy methods such as quantum Monte Carlo on a held-out test set.
Original abstract
The ability to discover new materials with desirable properties is critical for numerous applications from helping mitigate climate change to advances in next generation computing hardware. AI has the potential to accelerate materials discovery and design by more effectively exploring the chemical space compared to other computational methods or by trial-and-error. While substantial progress has been made on AI for materials data, benchmarks, and models, a barrier that has emerged is the lack of publicly available training data and open pre-trained models. To address this, we present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models. OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity. Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard and are capable of predicting ground-state stability and formation energies to an F1 score above 0.9 and an accuracy of 20 meV/atom, respectively. We explore the impact of model size, auxiliary denoising objectives, and fine-tuning on performance across a range of datasets including OMat24, MPtraj, and Alexandria. The open release of the OMat24 dataset and models enables the research community to build upon our efforts and drive further advancements in AI-assisted materials science.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Open Materials 2024 (OMat24) dataset containing over 110 million DFT calculations on inorganic materials, with emphasis on structural and compositional diversity. It releases pre-trained EquiformerV2 models and reports that these achieve state-of-the-art performance on the independent Matbench Discovery leaderboard, with F1 scores above 0.9 for ground-state stability and 20 meV/atom accuracy for formation energies. The work also examines the effects of model scale, auxiliary denoising objectives, and fine-tuning across OMat24, MPtraj, and Alexandria datasets.
Significance. The open release of a large-scale DFT dataset and accompanying pre-trained models is a clear strength that can accelerate community progress in AI-assisted materials discovery. If the reported generalization performance holds, the results would mark a meaningful advance in predictive accuracy for stability and formation energies. The use of an external public leaderboard for evaluation is a positive design choice that reduces circularity risk.
major comments (2)
- [Abstract] Abstract: the central claim of SOTA performance (F1 > 0.9 and 20 meV/atom) on Matbench Discovery rests on the assumption of no data leakage, yet the abstract provides no description of deduplication protocols, compositional splits, fingerprint-based filtering, or any exclusion of Matbench test compositions/structures from OMat24.
- [Results] Results section (performance reporting): the headline metrics are given without error bars, detailed data-split descriptions, or explicit confirmation that OMat24 construction avoided overlap with the Matbench Discovery test partition, which is load-bearing for the generalization claim given that both draw from the same inorganic DFT ecosystem.
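The overlap concern in both major comments can be checked at the composition level with a short script. The sketch below assumes formulas are simple strings without parentheses or fractional occupancies, and it tests only reduced compositions; it says nothing about structural duplicates, which would need a fingerprint comparison. The function names and toy formulas are hypothetical.

```python
import re
from functools import reduce
from math import gcd

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def reduced_composition(formula: str):
    """Map a simple formula such as 'Fe4O6' to a canonical key (('Fe', 2), ('O', 3)).

    Assumption: no parentheses, charges, or fractional counts in the input.
    """
    counts = {}
    for element, number in _TOKEN.findall(formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    divisor = reduce(gcd, counts.values())
    return tuple(sorted((el, n // divisor) for el, n in counts.items()))


def composition_overlap(train_formulas, test_formulas):
    """Return the test formulas whose reduced composition also occurs in training."""
    train_keys = {reduced_composition(f) for f in train_formulas}
    return sorted(f for f in test_formulas if reduced_composition(f) in train_keys)


train = ["Fe2O3", "NaCl", "Li2O"]
test = ["Fe4O6", "MgO", "ClNa"]
leaked = composition_overlap(train, test)  # ['ClNa', 'Fe4O6']
```

An empty result would support the no-leakage claim only at the composition level; polymorphs of a shared composition would still need a structure-level check.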
minor comments (1)
- [Methods] The manuscript would benefit from a dedicated subsection or table summarizing the exact train/validation/test splits used for OMat24 and any hyperparameter selection procedures.
Simulated Author's Rebuttal
We thank the referee for the constructive comments highlighting the need for greater transparency around data leakage prevention and performance reporting. These points strengthen the manuscript, and we have revised the abstract and results section to address them directly while preserving the original scientific claims.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim of SOTA performance (F1 > 0.9 and 20 meV/atom) on Matbench Discovery rests on the assumption of no data leakage, yet the abstract provides no description of deduplication protocols, compositional splits, fingerprint-based filtering, or any exclusion of Matbench test compositions/structures from OMat24.
Authors: We agree that the abstract should explicitly reference the deduplication steps to support the generalization claim. In the revised version we have added one sentence to the abstract stating that OMat24 was constructed with compositional and structural deduplication (via fingerprint-based filtering and exclusion of Matbench test compositions) to ensure no overlap with the Matbench Discovery test partition. These protocols were already described in the Methods and SI of the original submission; the revision simply makes them visible at the abstract level. revision: yes
- Referee: [Results] Results section (performance reporting): the headline metrics are given without error bars, detailed data-split descriptions, or explicit confirmation that OMat24 construction avoided overlap with the Matbench Discovery test partition, which is load-bearing for the generalization claim given that both draw from the same inorganic DFT ecosystem.
Authors: We accept the referee’s observation. The revised results section now includes (i) error bars on all headline F1 and MAE values, (ii) an expanded paragraph detailing the train/validation/test splits and the exact filtering criteria applied during OMat24 construction, and (iii) an explicit statement confirming that no Matbench Discovery test compositions or structures were included in OMat24. These additions were moved from the SI into the main text for clarity; the underlying data-handling procedures remain unchanged. revision: yes
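One way to attach error bars to a headline MAE is a percentile bootstrap over per-structure errors. The sketch below is an illustration only: the rebuttal does not say how its error bars were computed, and the resample count, confidence level, and toy error values are assumptions.

```python
import random


def bootstrap_ci(errors, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `errors`."""
    rng = random.Random(seed)
    n = len(errors)
    means = []
    for _ in range(n_resamples):
        # Resample the per-structure errors with replacement.
        sample = [errors[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-structure absolute errors in meV/atom.
abs_errors_mev = [12.0, 25.0, 18.0, 30.0, 9.0, 22.0, 15.0, 27.0, 20.0, 22.0]
mae = sum(abs_errors_mev) / len(abs_errors_mev)
low, high = bootstrap_ci(abs_errors_mev)
```

The same resampling loop works for F1 if each resample recomputes the score from paired true and predicted labels instead of averaging errors.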
Circularity Check
No significant circularity detected
Full rationale
The paper introduces the OMat24 dataset of 110M DFT calculations and reports EquiformerV2 model performance on the external Matbench Discovery leaderboard (F1 > 0.9, 20 meV/atom). No load-bearing step reduces to a self-definition, fitted parameter renamed as prediction, or self-citation chain; the benchmark evaluation is independent of the training data construction described. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- EquiformerV2 model hyperparameters and training schedule
axioms (1)
- domain assumption: Density functional theory calculations yield sufficiently accurate ground-state energies and stabilities for inorganic materials