MatMind: A Structure-Activity Knowledge-Driven Generative Foundation Model for Materials Science

Zhan'ao Yao, Boxuan Zhang, Jingyuan Shu, Xiaoyu Wu, Rongyan Wang, Linjing Li, Dajun Zeng, Yudong Yao, Tingwei Chen, Youwei Wang, Xiaolin Zhao, Jiahui Shi, Jianjun Liu

arxiv: 2606.07712 · v1 · pith:LSDDFENUnew · submitted 2026-06-05 · cond-mat.mtrl-sci · cs.AI

Pith reviewed 2026-06-27 21:30 UTC · model grok-4.3

keywords: crystal materials · generative foundation model · property prediction · structure generation · physics-informed reinforcement learning · materials science · large language model

The pith

MatMind unifies crystal property prediction and generation in one model that beats specialized graph networks on energy, modulus, and band gap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MatMind as a generative foundation model that combines structure-activity knowledge with physics-informed training to handle both numerical property prediction and crystal structure generation. It reports lower mean absolute errors than dedicated graph neural networks on energy above hull, bulk modulus, and band gap, while reaching a 65.3 percent stable-unique-novel (S.U.N.) rate in unconditional generation and strong gains in a low-data conditioned generation task. A sympathetic reader would care because this suggests a single model can replace multiple narrow specialists across materials tasks without apparent performance trade-offs. The work argues that large language model architectures, when trained progressively with dual heads and reinforcement learning, create a shared representation space suitable for the full range of crystal problems.

Core claim

MatMind attains the lowest mean absolute error on energy above hull, bulk modulus, and band gap, surpassing graph neural network predictors purpose-built for these tasks, reaches an S.U.N. rate of 65.3 percent on unconditional crystal generation, and achieves a comparable multiplicative improvement on magnetization-density-conditioned generation where only 21 positive samples exist within over 600000 training entries.
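To put the sparsity of the conditioned task in perspective, the short calculation below works out the positive-sample fraction from the two figures quoted in the claim. Because the claim says "over 600000" entries, the result is an upper bound on the fraction; the variable names are illustrative only.

```python
# Figures quoted in the core claim. "over 600000" means the true
# fraction of positives can only be smaller than what is computed here.
positives = 21
training_entries_lower_bound = 600_000

# Upper bound on the fraction of training entries that are positive.
base_rate = positives / training_entries_lower_bound

# Lower bound on how many entries there are per positive example.
entries_per_positive = training_entries_lower_bound / positives

print(f"positive fraction is at most {base_rate:.4%}")
print(f"at least {entries_per_positive:,.0f} entries per positive example")
```

In other words, fewer than four training entries in every hundred thousand carry the target property.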

What carries the argument

The progressive training framework that combines structure-activity knowledge injection, dual-head joint training of language reasoning and numerical regression, and multi-objective physics-informed reinforcement learning over stability, novelty, and structural diversity.
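As a rough illustration of what "dual-head joint training" can mean, the sketch below computes one combined loss from a single shared representation: a token-level cross-entropy for a language head plus an absolute-error term for a regression head. The abstract does not give the actual loss, pooling, or weighting, so everything here (mean pooling, the L1 regression term, the `reg_weight` coefficient, and all function names) is an assumption made for illustration, not the authors' implementation.

```python
import math

def cross_entropy(logits, target_id):
    """Cross-entropy of one token from unnormalised logits (log-sum-exp form)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_id]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_head_loss(hidden, lm_head, reg_head, target_ids, target_value, reg_weight=1.0):
    """Joint loss over one shared representation (hypothetical formulation).

    hidden:   list of token vectors, standing in for the backbone output
    lm_head:  one weight row per vocabulary entry (language head)
    reg_head: single weight vector (regression head)
    """
    # Language head: average next-token cross-entropy over all positions.
    lm_loss = sum(
        cross_entropy([dot(h, row) for row in lm_head], t)
        for h, t in zip(hidden, target_ids)
    ) / len(hidden)
    # Regression head: mean-pool the same hidden states (assumed pooling).
    dim = len(hidden[0])
    pooled = [sum(h[i] for h in hidden) / len(hidden) for i in range(dim)]
    reg_loss = abs(dot(pooled, reg_head) - target_value)  # L1, in line with an MAE metric
    return lm_loss + reg_weight * reg_loss, lm_loss, reg_loss

# Toy inputs, not taken from the paper.
hidden = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
lm_head = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
reg_head = [0.7, -0.3, 0.1]
total, lm_loss, reg_loss = dual_head_loss(
    hidden, lm_head, reg_head, target_ids=[2, 1], target_value=0.25
)
print(total, lm_loss, reg_loss)
```

Both terms are computed from the same `hidden` values, which is the sense in which the two heads would share a representation; whether that sharing helps or hurts either task is exactly what the premise below leaves open.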

If this is right

The model serves as a viable single backbone for crystal materials science across prediction and generation tasks.
Structure-activity knowledge injection and physics-informed reinforcement learning allow matching narrow specialists without separate architectures.
The approach succeeds on conditioned generation even when positive examples are extremely rare in the training data.
Unconditional generation reaches a 65.3 percent stable-unique-novel rate within the unified model.
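For readers unfamiliar with the metric, a stable-unique-novel (S.U.N.) rate is the fraction of generated structures that pass all three checks at once. The sketch below shows one way such a rate could be computed. The stability threshold of 0.1 eV/atom, the string `fingerprint` used for matching, and the choice to count the first copy of a duplicate as unique are assumptions of this example; the paper's exact protocol is not given in the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneratedCrystal:
    fingerprint: str      # hypothetical canonical identifier for structure matching
    e_above_hull: float   # eV/atom, from some external stability estimate

def sun_rate(samples, known_fingerprints, stability_threshold=0.1):
    """Fraction of samples that are simultaneously stable, unique and novel."""
    seen = set()
    hits = 0
    for sample in samples:
        stable = sample.e_above_hull <= stability_threshold
        unique = sample.fingerprint not in seen             # first occurrence only
        novel = sample.fingerprint not in known_fingerprints
        seen.add(sample.fingerprint)
        hits += stable and unique and novel
    return hits / len(samples) if samples else 0.0

# Toy batch, not data from the paper.
samples = [
    GeneratedCrystal("A", 0.02),  # stable, unique, novel: counts
    GeneratedCrystal("A", 0.02),  # duplicate of the first sample
    GeneratedCrystal("B", 0.30),  # above the stability threshold
    GeneratedCrystal("C", 0.05),  # stable, unique, novel: counts
    GeneratedCrystal("D", 0.00),  # already in the reference set
]
rate = sun_rate(samples, known_fingerprints={"D"})
print(f"S.U.N. rate: {rate:.1%}")
```

Because the three conditions are joined with a logical AND, the rate can never exceed the smallest of the individual stable, unique, or novel fractions.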

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same training recipe might transfer to other structured scientific domains that mix symbolic and numerical data.
Future tests could check whether the shared representation remains stable when the model is scaled or fine-tuned on experimental rather than computed data.
If the dual-head design truly avoids trade-offs, adding new regression targets should not degrade generation quality.

Load-bearing premise

The progressive training framework produces a shared representation that enables the reported performance gains without task-specific trade-offs or overfitting.

What would settle it

An independent evaluation on a standard benchmark set showing that MatMind's mean absolute error on any of the three properties exceeds that of the best dedicated graph neural network would falsify the claim of unified superiority.
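The test described above can be sketched as a small check: compute the mean absolute error per property for both models on the same held-out targets, then list any property where the unified model is worse. The property keys, function names, and all numbers below are made up for illustration and are not results from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |target - prediction| over paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def properties_where_baseline_wins(targets, unified_preds, baseline_preds):
    """Properties on which the unified model's MAE exceeds the baseline's."""
    return [
        prop
        for prop, y in targets.items()
        if mean_absolute_error(y, unified_preds[prop])
        > mean_absolute_error(y, baseline_preds[prop])
    ]

# Toy held-out data, not benchmark values.
targets = {
    "energy_above_hull": [0.0, 0.1, 0.2],
    "bulk_modulus": [100.0, 150.0],
    "band_gap": [1.0, 2.0],
}
unified_preds = {
    "energy_above_hull": [0.01, 0.09, 0.22],
    "bulk_modulus": [110.0, 145.0],
    "band_gap": [1.1, 2.3],
}
baseline_preds = {
    "energy_above_hull": [0.05, 0.05, 0.25],
    "bulk_modulus": [120.0, 140.0],
    "band_gap": [1.1, 2.1],
}
failures = properties_where_baseline_wins(targets, unified_preds, baseline_preds)
print(failures or "unified model is at least as good on every property")
```

A non-empty `failures` list on a standard benchmark is the outcome that would count against the claim. In this toy data the list contains only `band_gap`.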

Original abstract

Progress in AI-driven crystal materials science has so far been carried by narrow architectures purpose-built for individual tasks -- graph neural networks for property prediction, diffusion and flow-matching models for crystal generation -- each excelling within its niche yet unable to act as a shared backbone across the full spectrum of materials problems. Generative large language models offer a fundamentally different paradigm, in which structural representation, quantitative prediction, and structure-activity reasoning can be unified within one model, but the materials community has yet to see this paradigm realized at a level competitive with established narrow specialists. Here we present MatMind, a generative foundation model purpose-built for crystal materials science under this paradigm, developed through the coordinated activation of structure-activity knowledge and physics-informed feedback within a progressive training framework -- combining structure-activity knowledge injection, a dual-head architecture that jointly trains language reasoning and numerical regression in a shared representation space, and multi-objective physics-informed reinforcement learning over stability, novelty, and structural diversity. Across three task families, MatMind attains the lowest mean absolute error on energy above hull, bulk modulus, and band gap -- surpassing graph neural network predictors purpose-built for these tasks -- reaches an S.U.N. rate of 65.3% on unconditional crystal generation, and achieves a comparable multiplicative improvement on magnetization-density-conditioned generation, where only 21 positive samples exist within over 600000 training entries. By matching or surpassing narrow specialists on their own ground while operating within a single unified model, MatMind shows that the LLM-based paradigm can serve as a viable backbone for crystal materials science going forward.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MatMind claims one LLM backbone can beat task-specific GNNs on prediction and deliver strong generation, but the abstract leaves the supporting experiments thin.

The new element is the progressive training stack: structure-activity knowledge injection, a dual-head setup that trains language reasoning and numerical regression in the same space, and then multi-objective physics-informed RL on stability, novelty, and diversity. That combination is the concrete technical step beyond prior LLM-for-materials work.

The paper does a clean job stating the limitation of narrow architectures and showing where a shared model could help. The reported numbers (lowest MAE on energy above hull, bulk modulus, and band gap, plus 65.3% S.U.N. on unconditional generation and gains in the 21-sample magnetization-conditioned case) are the results worth checking.

The soft spots sit in the evaluation. No data splits, baseline details, error bars, or ablation on the RL reward weights appear in the abstract, so it is not yet possible to judge whether the gains are fair or whether the RL stage simply overfit the reported metrics. The central assumption that the shared representation avoids task trade-offs also needs direct controls to hold.

This is for groups already working on AI methods for crystal materials who want to test whether an LLM route can replace the current GNN-plus-diffusion toolkit. A reader focused on unified architectures would get the most from the training framework description.

I would send it to peer review so the methods and comparisons can be examined in full.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces MatMind, a generative foundation model for crystal materials science developed via a progressive training framework that combines structure-activity knowledge injection, dual-head joint training of language reasoning and numerical regression, and multi-objective physics-informed reinforcement learning over stability, novelty, and structural diversity. It claims to achieve the lowest MAE on energy above hull, bulk modulus, and band gap (surpassing purpose-built GNN predictors), a 65.3% S.U.N. rate on unconditional crystal generation, and comparable gains on magnetization-density-conditioned generation despite only 21 positive samples in a >600k-entry training set.

Significance. If the reported results hold under rigorous validation, the work would be significant for establishing that a single LLM-based model can serve as a viable shared backbone across property prediction and generation tasks in materials science, matching or exceeding narrow specialists without evident task-specific trade-offs. The conditioned-generation result with extremely sparse positive data is a notable strength if the evaluation protocol is sound.

minor comments (2)

[Abstract and §4] The performance claims would be easier to assess against the GNN comparators if the data splits, baseline implementations, error bars, and validation procedures were reported explicitly, including how the 21-sample conditioned regime was handled.
[Methods] The multi-objective RL component lists reward weights as free parameters; a short sensitivity analysis or ablation on these weights would clarify robustness of the shared representation claim.
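One common way to attach error bars to an MAE figure, as the first comment requests, is a percentile bootstrap over the per-sample absolute errors. The sketch below is a generic illustration of that technique, not a procedure taken from the manuscript; the function name, resample count, and toy errors are assumptions.

```python
import random

def bootstrap_mae_interval(abs_errors, n_resamples=2000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap interval for a mean absolute error."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(abs_errors)
    # Resample the errors with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(abs_errors, k=n)) / n for _ in range(n_resamples)
    )
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(abs_errors) / n, low, high

# Toy per-sample absolute errors, not values from the paper.
errors = [0.02, 0.05, 0.01, 0.08, 0.03, 0.04, 0.06, 0.02]
point, low, high = bootstrap_mae_interval(errors)
print(f"MAE = {point:.4f}, 95% interval [{low:.4f}, {high:.4f}]")
```

If the intervals for MatMind and a GNN baseline overlap heavily on a given property, a "lowest MAE" claim for that property would rest on a difference the test set cannot resolve.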

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the work's potential significance as a unified backbone, and recommendation for minor revision. We are encouraged by the assessment that the conditioned-generation result with sparse data is a notable strength if the protocol is sound. No specific major comments were enumerated in the report, so our response below is limited accordingly.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

The provided abstract and context describe a progressive training framework (knowledge injection + dual-head regression + multi-objective RL) and report empirical performance gains on prediction and generation tasks. No equations, self-citations, or training details are exhibited that reduce any claimed prediction or result to a fitted input by construction, nor is there a load-bearing self-citation chain or ansatz smuggled via prior work. The central claims rest on reported MAE improvements and S.U.N. rates as external outcomes rather than definitional equivalences. This is the expected non-finding for a methods paper whose results are presented as benchmark comparisons without internal reduction.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The abstract names no free parameters, axioms, or invented entities explicitly. The central claim nevertheless depends structurally on the unstated effectiveness of the three training components (knowledge injection, dual-head sharing, and multi-objective RL), whose internal weights and selection criteria are not detailed.

free parameters (1)

multi-objective RL reward weights
The reinforcement learning over stability, novelty, and structural diversity requires weighting these objectives; these weights are not specified and must be chosen or fitted.
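The sketch below shows why these weights matter, using the simplest possible combination: a normalised weighted sum of three scores. The abstract does not say how the objectives are actually combined, so the weighted-sum form, the assumption that each score lies in [0, 1], and the two toy candidates are all illustrative choices rather than the paper's method.

```python
def scalarized_reward(stability, novelty, diversity, weights):
    """Weighted-sum reward; each score is assumed to lie in [0, 1]."""
    w_s, w_n, w_d = weights
    total = w_s + w_n + w_d
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_s * stability + w_n * novelty + w_d * diversity) / total

# Hypothetical (stability, novelty, diversity) scores for two generations.
candidates = {
    "safe_copy": (0.95, 0.10, 0.20),   # very stable, close to known structures
    "bold_guess": (0.40, 0.90, 0.80),  # less stable, but new and different
}

def preferred(weights):
    """Name of the candidate with the highest reward under the given weights."""
    return max(candidates, key=lambda name: scalarized_reward(*candidates[name], weights))

for weights in [(1, 1, 1), (5, 1, 1)]:
    print(weights, "->", preferred(weights))
```

With equal weights the reward prefers `bold_guess`; with stability weighted five times as heavily it prefers `safe_copy`. The policy is pushed in a different direction depending on a choice the abstract leaves unspecified, which is why the weights are listed here as a free parameter.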

axioms (2)

domain assumption: The dual-head architecture can maintain a shared representation space that jointly supports language reasoning and numerical regression without performance trade-offs.
Invoked when the abstract states that the dual-head architecture jointly trains the two capabilities in a shared space.
domain assumption: Physics-informed feedback during RL produces stable, novel, and diverse crystals that generalize beyond the training distribution.
Invoked in the description of multi-objective physics-informed reinforcement learning.

pith-pipeline@v0.9.1-grok · 5862 in / 1643 out tokens · 28945 ms · 2026-06-27T21:30:46.198807+00:00 · methodology

Reference graph

Works this paper leans on

84 extracted references · 4 canonical work pages

[1]

Machine Learning: Science and Technology6(3), 030701 (2025)

Zimmermann, Y., Bazgir, A., Al-Feghali, A., Ansari, M., Bocarsly, J., Brinson, 22 L.C., Chiang, Y., Circi, D., Chiu, M.-H., Daelman, N.,et al.: 32 examples of llm applications in materials science and chemistry: towards automation, assis- tants, agents, and accelerated scientific discovery. Machine Learning: Science and Technology6(3), 030701 (2025)

2025
[2]

Npj Computational Materials11(1), 61 (2025)

Pyzer-Knapp, E.O., Manica, M., Staar, P., Morin, L., Ruch, P., Laino, T., Smith, J.R., Curioni, A.: Foundation models for materials discovery–current state and future directions. Npj Computational Materials11(1), 61 (2025)

2025
[3]

arXiv preprint arXiv:2305.05708 (2023)

Flam-Shepherd, D., Aspuru-Guzik, A.: Language models can generate molecules, materials, and protein binding sites directly in three dimensions as xyz, cif, and pdb files. arXiv preprint arXiv:2305.05708 (2023)

arXiv 2023
[4]

Alampara, N., Miret, S., Jablonka, K.M.: Mattext: Do language models need more than text & scale for materials modeling? arXiv preprint arXiv:2406.17295 (2024)

arXiv 2024
[5]

Iscience24(3) (2021)

Kononova, O., He, T., Huo, H., Trewartha, A., Olivetti, E.A., Ceder, G.: Oppor- tunities and challenges of text mining in materials research. Iscience24(3) (2021)

2021
[6]

Beltagy, I., Lo, K., Cohan, A.: Scibert: A pretrained language model for scientific text. In: Proceedings of the 2019 Conference on Empirical Methods in Natu- ral Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3615–3620 (2019)

2019
[7]

npj Computational Materials11(1), 186 (2025)

Niyongabo Rubungo, A., Arnold, C., Rand, B.P., Dieng, A.B.: Llm-prop: pre- dicting the properties of crystalline materials using large language models. npj Computational Materials11(1), 186 (2025)

2025
[8]

Nature Communications15(1), 10570 (2024)

Antunes, L.M., Butler, K.T., Grau-Crespo, R.: Crystal structure generation with autoregressive large language modeling. Nature Communications15(1), 10570 (2024)

2024
[9]

Available at SSRN 3950755 (2021)

Walker, N., Trewartha, A., Huo, H., Lee, S., Cruse, K., Dagdelen, J., Dunn, A., Persson, K., Ceder, G., Jain, A.: The impact of domain-specific pre-training on named entity recognition tasks in materials science. Available at SSRN 3950755 (2021)

2021
[10]

Park, H., Li, Z., Walsh, A.: Has generative artificial intelligence solved inverse materials design? Matter7(7), 2355–2367 (2024)

2024
[11]

Polymer Journal54(8), 957–967 (2022)

Amamoto, Y.: Data-driven approaches for structure-property relationships in polymer science for prediction and understanding. Polymer Journal54(8), 957–967 (2022)

2022
[12]

arXiv preprint arXiv:2605.14344 (2026)

Wu, Y., Falletta, S., McGrath, D., Yang, S.: Crystalreasoner: Reasoning 23 and rl for property-conditioned crystal structure generation. arXiv preprint arXiv:2605.14344 (2026)

Pith/arXiv arXiv 2026
[13]

arXiv preprint arXiv:2504.02367 (2025)

Cao, Z., Wang, L.: Crystalformer-rl: Reinforcement fine-tuning for materials design. arXiv preprint arXiv:2504.02367 (2025)

arXiv 2025
[14]

npj Computational Materials10(1), 287 (2024)

Karpovich, C., Pan, E., Olivetti, E.A.: Deep reinforcement learning for inverse inorganic materials design. npj Computational Materials10(1), 287 (2024)

2024
[15]

arXiv preprint arXiv:2512.04562 (2025)

Betala, S., Gleason, S.P., Ramlaoui, A., Xu, A., Channing, G., Levy, D., Fourrier, C., Kazeev, N., Joshi, C.K., Kaba, S.-O., et al.: Lemat-genbench: A unified eval- uation framework for crystal generative models. arXiv preprint arXiv:2512.04562 (2025)

arXiv 2025
[16]

In: Uncertainty in Artificial Intelligence, pp

Das, K., Goyal, P., Lee, S.-C., Bhattacharjee, S., Ganguly, N.: Crysmmnet: multi- modal representation for crystal property prediction. In: Uncertainty in Artificial Intelligence, pp. 507–517 (2023). PMLR

2023
[17]

Physical review letters 120(14), 145301 (2018)

Xie, T., Grossman, J.C.: Crystal graph convolutional neural networks for an accu- rate and interpretable prediction of material properties. Physical review letters 120(14), 145301 (2018)

2018
[18]

Nature Computational Science2(11), 718–728 (2022)

Chen, C., Ong, S.P.: A universal graph deep learning interatomic potential for the periodic table. Nature Computational Science2(11), 718–728 (2022)

2022
[19]

Nature639(8055), 624–632 (2025)

Zeni, C., Pinsler, R., Z¨ ugner, D., Fowler, A., Horton, M., Fu, X., Wang, Z., Shysheya, A., Crabb´ e, J., Ueda, S.,et al.: A generative model for inorganic materials design. Nature639(8055), 624–632 (2025)

2025
[20]

Advances in Neural Information Processing Systems36, 17464–17497 (2023)

Jiao, R., Huang, W., Lin, P., Han, J., Chen, P., Lu, Y., Liu, Y.: Crystal struc- ture prediction by joint equivariant diffusion. Advances in Neural Information Processing Systems36, 17464–17497 (2023)

2023
[21]

Current Opinion in Chemical Engineering36, 100778 (2022) https: //doi.org/10.1016/j.coche.2021.100778

Nandy, A., Duan, C., Kulik, H.J.: Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery. Current Opinion in Chemical Engineering36, 100778 (2022) https: //doi.org/10.1016/j.coche.2021.100778

work page doi:10.1016/j.coche.2021.100778 2022
[22]

Hugging Face

ScienceOne-AI: S1-Base: Scientific Foundation Model. Hugging Face. Accessed: 2025 (2025)

2025
[23]

Scientific data5(1), 180062 (2018)

Ghahremanpour, M.M., Van Maaren, P.J., Van Der Spoel, D.: The alexan- dria library, a quantum-chemical database of molecular properties for force field development. Scientific data5(1), 180062 (2018)

2018
[24]

Foundations of Crystallography 24 47(6), 655–685 (1991)

Hall, S.R., Allen, F.H., Brown, I.D.: The crystallographic information file (cif): a new standard archive file for crystallography. Foundations of Crystallography 24 47(6), 655–685 (1991)

1991
[25]

(ed.): International Tables for Crystallography, Volume A: Space- Group Symmetry, 6th edn

Aroyo, M.I. (ed.): International Tables for Crystallography, Volume A: Space- Group Symmetry, 6th edn. Springer, Dordrecht (2016). https://doi.org/10.1107/ 97809553602060000114

2016
[26]

Gordon and Breach Science Publishers, New York (1965)

Kovalev, O.V.: The Analytical Expression of the Results of the Theory of Space Groups. Gordon and Breach Science Publishers, New York (1965)

1965
[27]

APL materials 1(1) (2013)

Jain, A., Ong, S.P., Hautier, G., Chen, W., Richards, W.D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G., et al.: Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL materials 1(1) (2013)

2013
[28]

arXiv preprint arXiv:2412.09560 (2024)

Mishra, V., Singh, S., Ahlawat, D., Zaki, M., Bihani, V., Grover, H.S., Mishra, B., Miret, S., Krishnan, N., et al.: Foundational large language models for materials research. arXiv preprint arXiv:2412.09560 (2024)

arXiv 2024
[29]

Advances in neural information processing systems35, 24824–24837 (2022)

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D.,et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems35, 24824–24837 (2022)

2022
[30]

In: Proceedings of the 25th International Conference on Machine Learning, pp

Collobert, R., Weston, J.: A unified architecture for natural language process- ing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, pp. 160–167 (2008)

2008
[31]

In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp

Geva, M., Gupta, A., Berant, J.: Injecting numerical reasoning skills into lan- guage models. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 946–958 (2020)

2020
[32]

Wallace, E., Wang, Y., Li, S., Singh, S., Gardner, M.: Do nlp models know num- bers? probing numeracy in embeddings. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th Interna- tional Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5307–5315 (2019)

2019
[33]

In: Findings of the Association for Computational Linguistics: ACL 2023, pp

Hsieh, C.-Y., Li, C.-L., Yeh, C.-K., Nakhost, H., Fujii, Y., Ratner, A., Krishna, R., Lee, C.-Y., Pfister, T.: Distilling step-by-step! outperforming larger language models with less training data and smaller model sizes. In: Findings of the Association for Computational Linguistics: ACL 2023, pp. 8003–8017 (2023)

2023
[34]

arXiv preprint arXiv:2402.03300 (2024)

Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Bi, X., Zhang, H., Zhang, M., Li, Y., Wu, Y., et al.: Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300 (2024)

Pith/arXiv arXiv 2024
[35]

25 In: AI for Accelerated Materials Design-NeurIPS 2024 (2024)

Gonzales, C., Fuemmeler, E., Tadmor, E.B., Martiniani, S., Miret, S.: Benchmark- ing of universal machine learning interatomic potentials for structural relaxation. 25 In: AI for Accelerated Materials Design-NeurIPS 2024 (2024)

2024
[36]

arXiv preprint arXiv:2511.07158 (2025)

Park, H., Walsh, A.: Guiding generative models to uncover diverse and novel crystals via reinforcement learning. arXiv preprint arXiv:2511.07158 (2025)

arXiv 2025
[37]

Scientific data2(1), 150009 (2015)

De Jong, M., Chen, W., Angsten, T., Jain, A., Notestine, R., Gamst, A., Sluiter, M., Krishna Ande, C., Van Der Zwaag, S., Plata, J.J.,et al.: Charting the com- plete elastic properties of inorganic crystalline compounds. Scientific data2(1), 150009 (2015)

2015
[38]

Chemical reviews112(1), 289–320 (2012)

Cohen, A.J., Mori-S´ anchez, P., Yang, W.: Challenges for density functional theory. Chemical reviews112(1), 289–320 (2012)

2012
[39]

arXiv preprint arXiv:2308.14920 (2023)

Riebesell, J., Goodall, R.E., Benner, P., Chiang, Y., Deng, B., Lee, A.A., Jain, A., Persson, K.A.: Matbench discovery–a framework to evaluate machine learning crystal stability predictions. arXiv preprint arXiv:2308.14920 (2023)

arXiv 2023
[40]

arXiv preprint arXiv:2401.079504(2024)

Zhang, D., Hu, Z., Zhoubian, S., Du, Z., Yang, K., Wang, Z., Yue, Y., Dong, Y., Tang, J.: Sciglm: Training scientific language models with self-reflective instruction annotation and tuning. arXiv preprint arXiv:2401.079504(2024)

arXiv 2024
[41]

Journal of machine learning research9(11) (2008)

Maaten, L., Hinton, G.: Visualizing data using t-sne. Journal of machine learning research9(11) (2008)

2008
[42]

Science advances4(7), 7885 (2018)

Popova, M., Isayev, O., Tropsha, A.: Deep reinforcement learning for de novo drug design. Science advances4(7), 7885 (2018)

2018
[43]

Nature energy 1(9), 1–4 (2016)

Janek, J., Zeier, W.G.: A solid future for battery development. Nature energy 1(9), 1–4 (2016)

2016
[44]

Materials Project database

Ti2O3 (mp-458) Materials Project Entry. Materials Project database. Accessed: 2026-06. https://materialsproject.org/materials/mp-458

2026
[45]

Journal of Solid State Chemistry9(3), 255–260 (1974) https://doi.org/10.1016/0022-4596(74)90082-6

Robinson, W.R.: The crystal structures of ti2o3, a semiconductor, and (ti0.900v0.100)2o3, a semimetal. Journal of Solid State Chemistry9(3), 255–260 (1974) https://doi.org/10.1016/0022-4596(74)90082-6

work page doi:10.1016/0022-4596(74)90082-6 1974
[46]

Science361(6400), 360–365 (2018)

Sanchez-Lengeling, B., Aspuru-Guzik, A.: Inverse molecular design using machine learning: Generative models for matter engineering. Science361(6400), 360–365 (2018)

2018
[47]

arXiv preprint arXiv:2110.06197 (2021)

Xie, T., Fu, X., Ganea, O.-E., Barzilay, R., Jaakkola, T.: Crystal diffu- sion variational autoencoder for periodic material generation. arXiv preprint arXiv:2110.06197 (2021)

arXiv 2021
[48]

Nature materials 12(3), 191–201 (2013) 26

Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013) 26

2013
[49]

Scientific reports8(1), 14794 (2018)

Lee, M., Youn, Y., Yim, K., Han, S.: High-throughput ab initio calculations on dielectric constant and band gap of non-oxide dielectrics. Scientific reports8(1), 14794 (2018)

2018
[50]

Journal of Fluorine Chemistry132(12), 1165–1173 (2011)

Stevenson, A.J., Serier-Brault, H., Gredin, P., Mortier, M.: Fluoride materials for optical applications: Single crystals, ceramics, glasses, and glass–ceramics. Journal of Fluorine Chemistry132(12), 1165–1173 (2011)

2011
[51]

Physica B: Condensed Matter591, 412240 (2020)

Ahmed, S., Shakil, M., Zafar, M., Zeba, I., Ahmad, R., Gillani, S.: Theoretical investigation of structural, mechanical, electronic and thermal behavior of plat- inum group metals and their intermetallic alloys ptrhx (x= pd, ir, os, ru). Physica B: Condensed Matter591, 412240 (2020)

2020
[52]

Journal of materials Science33(1), 167–171 (1998)

Serebrinsky, S., Gervasoni, J., Abriata, J., Ponce, V.: Characterization of the electronic density of metals in terms of the bulk modulus. Journal of materials Science33(1), 167–171 (1998)

1998
[53]

Advanced materials23(7), 821–842 (2011)

Gutfleisch, O., Willard, M.A., Br¨ uck, E., Chen, C.H., Sankar, S., Liu, J.P.: Mag- netic materials and devices for the 21st century: stronger, lighter, and more energy efficient. Advanced materials23(7), 821–842 (2011)

2011
[54]

Journal of Physics D: Applied Physics40(9), 149–177 (2007)

Richter, H.J.: The transition from longitudinal to perpendicular recording. Journal of Physics D: Applied Physics40(9), 149–177 (2007)

2007
[55]

science294(5546), 1488–1495 (2001)

Wolf, S.A., Awschalom, D.D., Buhrman, R.A., Daughton, J., Moln´ ar, v.S., Roukes, M.L., Chtchelkanova, A.Y., Treger, D.M.: Spintronics: a spin-based electronics vision for the future. science294(5546), 1488–1495 (2001)

2001
[56]

Electronic Structure 3(3), 033001 (2021)

Zhang, H.: High-throughput design of magnetic materials. Electronic Structure 3(3), 033001 (2021)

2021
[57]

npj Computational Materials10(1), 144 (2024)

Omee, S.S., Fu, N., Dong, R., Hu, M., Hu, J.: Structure-based out-of-distribution (ood) materials property prediction: a benchmark study. npj Computational Materials10(1), 144 (2024)

2024
[58]

In: International Conference on Machine Learning, pp

Yang, Y., Zha, K., Chen, Y., Wang, H., Katabi, D.: Delving into deep imbalanced regression. In: International Conference on Machine Learning, pp. 11842–11851 (2021). PMLR

2021
[59]

arXiv preprint arXiv:2511.03112 (2025)

Chen, J., Guo, J., Fako, E., Schwaller, P.: Accelerating inverse materials design using generative diffusion models with reinforcement learning. arXiv preprint arXiv:2511.03112 (2025)

arXiv 2025
[60]

Journal of Applied Physics38(3), 1001–1002 (1967) 27

Strnat, K., Hoffer, G., Olson, J., Ostertag, W., Becker, J.: A family of new cobalt- base permanent magnet materials. Journal of Applied Physics38(3), 1001–1002 (1967) 27

1967
[61]

IEEE transactions on Magnetics20(5), 1584–1589 (1984)

Sagawa, M., Fujimura, S., Yamamoto, H., Matsuura, Y., Hiraga, K.: Permanent magnet materials based on the rare earth-iron-boron tetragonal compounds. IEEE transactions on Magnetics20(5), 1584–1589 (1984)

1984
[62]

IEEE Transactions on mag- netics47(12), 4671–4681 (2011)

Coey, J.: Hard magnetic materials: A perspective. IEEE Transactions on mag- netics47(12), 4671–4681 (2011)

2011
[63]

Physica B: Condensed Matter172(1-2), 95–100 (1991)

Brooks, M., Nordstr¨ om, L., Johansson, B.: Rare-earth transition-metal inter- metallics. Physica B: Condensed Matter172(1-2), 95–100 (1991)

1991
[64]

npj Computational Materials5(1), 21 (2019)

Lookman, T., Balachandran, P.V., Xue, D., Yuan, R.: Active learning in materi- als science with emphasis on adaptive sampling using uncertainties for targeted design. npj Computational Materials5(1), 21 (2019)

2019
[65]

Nature624(7990), 80–85 (2023)

Merchant, A., Batzner, S., Schoenholz, S.S., Aykol, M., Cheon, G., Cubuk, E.D.: Scaling deep learning for materials discovery. Nature624(7990), 80–85 (2023)

2023
[66]

Physical Review Materials 6(3), 033801 (2022)

Ekstr¨ om Kelvinius, F., Armiento, R., Lindsten, F.: Graph-based machine learning beyond stable materials and relaxed crystal structures. Physical Review Materials 6(3), 033801 (2022)

2022
[67]

MRS Communications9(3), 874–881 (2019)

Ganose, A.M., Jain, A.: Robocrystallographer: automated crystal structure text descriptions and analysis. MRS Communications9(3), 874–881 (2019)

2019
[68]

In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (volume 1: Long Papers), pp

Wang, Y., Kordi, Y., Mishra, S., Liu, A., Smith, N.A., Khashabi, D., Hajishirzi, H.: Self-instruct: Aligning language models with self-generated instructions. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (volume 1: Long Papers), pp. 13484–13508 (2023)

2023
[69]

Nature Communications (2026)



[1] Zimmermann, Y., Bazgir, A., Al-Feghali, A., Ansari, M., Bocarsly, J., Brinson, L.C., Chiang, Y., Circi, D., Chiu, M.-H., Daelman, N., et al.: 32 examples of llm applications in materials science and chemistry: towards automation, assistants, agents, and accelerated scientific discovery. Machine Learning: Science and Technology 6(3), 030701 (2025)

[2] Pyzer-Knapp, E.O., Manica, M., Staar, P., Morin, L., Ruch, P., Laino, T., Smith, J.R., Curioni, A.: Foundation models for materials discovery–current state and future directions. npj Computational Materials 11(1), 61 (2025)

[3] Flam-Shepherd, D., Aspuru-Guzik, A.: Language models can generate molecules, materials, and protein binding sites directly in three dimensions as xyz, cif, and pdb files. arXiv preprint arXiv:2305.05708 (2023)

[4] Alampara, N., Miret, S., Jablonka, K.M.: Mattext: Do language models need more than text & scale for materials modeling? arXiv preprint arXiv:2406.17295 (2024)

[5] Kononova, O., He, T., Huo, H., Trewartha, A., Olivetti, E.A., Ceder, G.: Opportunities and challenges of text mining in materials research. iScience 24(3) (2021)

[6] Beltagy, I., Lo, K., Cohan, A.: Scibert: A pretrained language model for scientific text. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3615–3620 (2019)

[7] Niyongabo Rubungo, A., Arnold, C., Rand, B.P., Dieng, A.B.: Llm-prop: predicting the properties of crystalline materials using large language models. npj Computational Materials 11(1), 186 (2025)

[8] Antunes, L.M., Butler, K.T., Grau-Crespo, R.: Crystal structure generation with autoregressive large language modeling. Nature Communications 15(1), 10570 (2024)

[9] Walker, N., Trewartha, A., Huo, H., Lee, S., Cruse, K., Dagdelen, J., Dunn, A., Persson, K., Ceder, G., Jain, A.: The impact of domain-specific pre-training on named entity recognition tasks in materials science. Available at SSRN 3950755 (2021)

[10] Park, H., Li, Z., Walsh, A.: Has generative artificial intelligence solved inverse materials design? Matter 7(7), 2355–2367 (2024)

[11] Amamoto, Y.: Data-driven approaches for structure-property relationships in polymer science for prediction and understanding. Polymer Journal 54(8), 957–967 (2022)

[12] Wu, Y., Falletta, S., McGrath, D., Yang, S.: Crystalreasoner: Reasoning and rl for property-conditioned crystal structure generation. arXiv preprint arXiv:2605.14344 (2026)

[13] Cao, Z., Wang, L.: Crystalformer-rl: Reinforcement fine-tuning for materials design. arXiv preprint arXiv:2504.02367 (2025)

[14] Karpovich, C., Pan, E., Olivetti, E.A.: Deep reinforcement learning for inverse inorganic materials design. npj Computational Materials 10(1), 287 (2024)

[15] Betala, S., Gleason, S.P., Ramlaoui, A., Xu, A., Channing, G., Levy, D., Fourrier, C., Kazeev, N., Joshi, C.K., Kaba, S.-O., et al.: Lemat-genbench: A unified evaluation framework for crystal generative models. arXiv preprint arXiv:2512.04562 (2025)

[16] Das, K., Goyal, P., Lee, S.-C., Bhattacharjee, S., Ganguly, N.: Crysmmnet: multimodal representation for crystal property prediction. In: Uncertainty in Artificial Intelligence, pp. 507–517 (2023). PMLR

[17] Xie, T., Grossman, J.C.: Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Physical review letters 120(14), 145301 (2018)

[18] Chen, C., Ong, S.P.: A universal graph deep learning interatomic potential for the periodic table. Nature Computational Science 2(11), 718–728 (2022)

[19] Zeni, C., Pinsler, R., Zügner, D., Fowler, A., Horton, M., Fu, X., Wang, Z., Shysheya, A., Crabbé, J., Ueda, S., et al.: A generative model for inorganic materials design. Nature 639(8055), 624–632 (2025)

[20] Jiao, R., Huang, W., Lin, P., Han, J., Chen, P., Lu, Y., Liu, Y.: Crystal structure prediction by joint equivariant diffusion. Advances in Neural Information Processing Systems 36, 17464–17497 (2023)

[21] Nandy, A., Duan, C., Kulik, H.J.: Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery. Current Opinion in Chemical Engineering 36, 100778 (2022). https://doi.org/10.1016/j.coche.2021.100778

[22] ScienceOne-AI: S1-Base: Scientific Foundation Model. Hugging Face. Accessed: 2025 (2025)

[23] Ghahremanpour, M.M., Van Maaren, P.J., Van Der Spoel, D.: The alexandria library, a quantum-chemical database of molecular properties for force field development. Scientific data 5(1), 180062 (2018)

[24] Hall, S.R., Allen, F.H., Brown, I.D.: The crystallographic information file (cif): a new standard archive file for crystallography. Foundations of Crystallography 47(6), 655–685 (1991)

[25] Aroyo, M.I. (ed.): International Tables for Crystallography, Volume A: Space-Group Symmetry, 6th edn. Springer, Dordrecht (2016). https://doi.org/10.1107/97809553602060000114

[26] Kovalev, O.V.: The Analytical Expression of the Results of the Theory of Space Groups. Gordon and Breach Science Publishers, New York (1965)

[27] Jain, A., Ong, S.P., Hautier, G., Chen, W., Richards, W.D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G., et al.: Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL materials 1(1) (2013)

[28] Mishra, V., Singh, S., Ahlawat, D., Zaki, M., Bihani, V., Grover, H.S., Mishra, B., Miret, S., Krishnan, N., et al.: Foundational large language models for materials research. arXiv preprint arXiv:2412.09560 (2024)

[29] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V., Zhou, D., et al.: Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems 35, 24824–24837 (2022)

[30] Collobert, R., Weston, J.: A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, pp. 160–167 (2008)

[31] Geva, M., Gupta, A., Berant, J.: Injecting numerical reasoning skills into language models. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 946–958 (2020)

[32] Wallace, E., Wang, Y., Li, S., Singh, S., Gardner, M.: Do nlp models know numbers? probing numeracy in embeddings. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5307–5315 (2019)

[33] Hsieh, C.-Y., Li, C.-L., Yeh, C.-K., Nakhost, H., Fujii, Y., Ratner, A., Krishna, R., Lee, C.-Y., Pfister, T.: Distilling step-by-step! outperforming larger language models with less training data and smaller model sizes. In: Findings of the Association for Computational Linguistics: ACL 2023, pp. 8003–8017 (2023)

[34] Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Bi, X., Zhang, H., Zhang, M., Li, Y., Wu, Y., et al.: Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300 (2024)

[35] Gonzales, C., Fuemmeler, E., Tadmor, E.B., Martiniani, S., Miret, S.: Benchmarking of universal machine learning interatomic potentials for structural relaxation. In: AI for Accelerated Materials Design-NeurIPS 2024 (2024)

[36] Park, H., Walsh, A.: Guiding generative models to uncover diverse and novel crystals via reinforcement learning. arXiv preprint arXiv:2511.07158 (2025)

[37] De Jong, M., Chen, W., Angsten, T., Jain, A., Notestine, R., Gamst, A., Sluiter, M., Krishna Ande, C., Van Der Zwaag, S., Plata, J.J., et al.: Charting the complete elastic properties of inorganic crystalline compounds. Scientific data 2(1), 150009 (2015)

[38] Cohen, A.J., Mori-Sánchez, P., Yang, W.: Challenges for density functional theory. Chemical reviews 112(1), 289–320 (2012)

[39] Riebesell, J., Goodall, R.E., Benner, P., Chiang, Y., Deng, B., Lee, A.A., Jain, A., Persson, K.A.: Matbench discovery–a framework to evaluate machine learning crystal stability predictions. arXiv preprint arXiv:2308.14920 (2023)

[40] Zhang, D., Hu, Z., Zhoubian, S., Du, Z., Yang, K., Wang, Z., Yue, Y., Dong, Y., Tang, J.: Sciglm: Training scientific language models with self-reflective instruction annotation and tuning. arXiv preprint arXiv:2401.07950 (2024)

[41] Maaten, L., Hinton, G.: Visualizing data using t-sne. Journal of machine learning research 9(11) (2008)

[42] Popova, M., Isayev, O., Tropsha, A.: Deep reinforcement learning for de novo drug design. Science advances 4(7), 7885 (2018)

[43] Janek, J., Zeier, W.G.: A solid future for battery development. Nature energy 1(9), 1–4 (2016)

[44] Ti2O3 (mp-458) Materials Project Entry. Materials Project database. Accessed: 2026-06. https://materialsproject.org/materials/mp-458

[45] Robinson, W.R.: The crystal structures of ti2o3, a semiconductor, and (ti0.900v0.100)2o3, a semimetal. Journal of Solid State Chemistry 9(3), 255–260 (1974). https://doi.org/10.1016/0022-4596(74)90082-6

[46] Sanchez-Lengeling, B., Aspuru-Guzik, A.: Inverse molecular design using machine learning: Generative models for matter engineering. Science 361(6400), 360–365 (2018)

[47] Xie, T., Fu, X., Ganea, O.-E., Barzilay, R., Jaakkola, T.: Crystal diffusion variational autoencoder for periodic material generation. arXiv preprint arXiv:2110.06197 (2021)

[48] Curtarolo, S., Hart, G.L., Nardelli, M.B., Mingo, N., Sanvito, S., Levy, O.: The high-throughput highway to computational materials design. Nature materials 12(3), 191–201 (2013)

[49] Lee, M., Youn, Y., Yim, K., Han, S.: High-throughput ab initio calculations on dielectric constant and band gap of non-oxide dielectrics. Scientific reports 8(1), 14794 (2018)

[50] Stevenson, A.J., Serier-Brault, H., Gredin, P., Mortier, M.: Fluoride materials for optical applications: Single crystals, ceramics, glasses, and glass–ceramics. Journal of Fluorine Chemistry 132(12), 1165–1173 (2011)

[51] Ahmed, S., Shakil, M., Zafar, M., Zeba, I., Ahmad, R., Gillani, S.: Theoretical investigation of structural, mechanical, electronic and thermal behavior of platinum group metals and their intermetallic alloys ptrhx (x= pd, ir, os, ru). Physica B: Condensed Matter 591, 412240 (2020)

[52] Serebrinsky, S., Gervasoni, J., Abriata, J., Ponce, V.: Characterization of the electronic density of metals in terms of the bulk modulus. Journal of materials Science 33(1), 167–171 (1998)

[53] Gutfleisch, O., Willard, M.A., Brück, E., Chen, C.H., Sankar, S., Liu, J.P.: Magnetic materials and devices for the 21st century: stronger, lighter, and more energy efficient. Advanced materials 23(7), 821–842 (2011)

[54] Richter, H.J.: The transition from longitudinal to perpendicular recording. Journal of Physics D: Applied Physics 40(9), 149–177 (2007)

[55] Wolf, S.A., Awschalom, D.D., Buhrman, R.A., Daughton, J., Molnár, v.S., Roukes, M.L., Chtchelkanova, A.Y., Treger, D.M.: Spintronics: a spin-based electronics vision for the future. Science 294(5546), 1488–1495 (2001)

[56] Zhang, H.: High-throughput design of magnetic materials. Electronic Structure 3(3), 033001 (2021)

[57] Omee, S.S., Fu, N., Dong, R., Hu, M., Hu, J.: Structure-based out-of-distribution (ood) materials property prediction: a benchmark study. npj Computational Materials 10(1), 144 (2024)

[58] Yang, Y., Zha, K., Chen, Y., Wang, H., Katabi, D.: Delving into deep imbalanced regression. In: International Conference on Machine Learning, pp. 11842–11851 (2021). PMLR

[59] Chen, J., Guo, J., Fako, E., Schwaller, P.: Accelerating inverse materials design using generative diffusion models with reinforcement learning. arXiv preprint arXiv:2511.03112 (2025)

[60] Strnat, K., Hoffer, G., Olson, J., Ostertag, W., Becker, J.: A family of new cobalt-base permanent magnet materials. Journal of Applied Physics 38(3), 1001–1002 (1967)

[61] Sagawa, M., Fujimura, S., Yamamoto, H., Matsuura, Y., Hiraga, K.: Permanent magnet materials based on the rare earth-iron-boron tetragonal compounds. IEEE transactions on Magnetics 20(5), 1584–1589 (1984)

[62] Coey, J.: Hard magnetic materials: A perspective. IEEE Transactions on magnetics 47(12), 4671–4681 (2011)

[63] Brooks, M., Nordström, L., Johansson, B.: Rare-earth transition-metal intermetallics. Physica B: Condensed Matter 172(1-2), 95–100 (1991)

[64] Lookman, T., Balachandran, P.V., Xue, D., Yuan, R.: Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. npj Computational Materials 5(1), 21 (2019)

[65] Merchant, A., Batzner, S., Schoenholz, S.S., Aykol, M., Cheon, G., Cubuk, E.D.: Scaling deep learning for materials discovery. Nature 624(7990), 80–85 (2023)

[66] Ekström Kelvinius, F., Armiento, R., Lindsten, F.: Graph-based machine learning beyond stable materials and relaxed crystal structures. Physical Review Materials 6(3), 033801 (2022)

[67] Ganose, A.M., Jain, A.: Robocrystallographer: automated crystal structure text descriptions and analysis. MRS Communications 9(3), 874–881 (2019)

[68] Wang, Y., Kordi, Y., Mishra, S., Liu, A., Smith, N.A., Khashabi, D., Hajishirzi, H.: Self-instruct: Aligning language models with self-generated instructions. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (volume 1: Long Papers), pp. 13484–13508 (2023)

[69] Wu, L., Huang, W., Jiao, R., Huang, J., Liu, L., Zhou, Y., Sun, H., Liu, Y., Sun, F., Ren, Y., et al.: Siamese foundation models for crystal structure prediction. Nature Communications (2026)

[70] Brock, C.P., Hahn, T., Wondratschek, H., Müller, U., Shmueli, U., Prince, E., Authier, A., Kopský, V., Litvin, D., Arnold, E., et al.: International tables for crystallography volume A: Space-group symmetry. Wiley Online Library (2016)

[71] Cao, Z., Luo, X., Lv, J., Wang, L.: Space group informed transformer for crystalline materials generation. Science Bulletin (2025)

[72] Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al.: Improving language understanding by generative pre-training (2018)

[73] Miettinen, K.: Nonlinear Multiobjective Optimization. International Series in Operations Research & Management Science, vol. 12. Springer, New York, NY (1999). https://doi.org/10.1007/978-1-4615-5563-6

[74] Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., et al.: Training language models to follow instructions with human feedback. Advances in neural information processing systems 35, 27730–27744 (2022)

[75] Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)

[76] West, A.R.: Solid State Chemistry and Its Applications, 2nd edn. John Wiley & Sons, Hoboken, NJ (2022)

[77] Batzner, S., Musaelian, A., Sun, L., Geiger, M., Mailoa, J.P., Kornbluth, M., Molinari, N., Smidt, T.E., Kozinsky, B.: E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nature Communications 13(1) (2022). https://doi.org/10.1038/s41467-022-29939-5

[78] Barroso-Luque, L., Shuaibi, M., Fu, X., Wood, B.M., Dzamba, M., Gao, M., Rizvi, A., Zitnick, C.L., Ulissi, Z.W.: Open materials 2024 (omat24) inorganic materials dataset and models. arXiv preprint arXiv:2410.12771 (2024)

[79] Ward, L., Dunn, A., Faghaninia, A., Zimmermann, N.E., Bajaj, S., Wang, Q., Montoya, J., Chen, J., Bystrom, K., Dylla, M., et al.: Matminer: An open source toolkit for materials data mining. Computational Materials Science 152, 60–69 (2018)

[80] Ward, L., Agrawal, A., Choudhary, A., Wolverton, C.: A general-purpose machine learning framework for predicting properties of inorganic materials. npj Computational Materials 2(1), 16028 (2016)