arXiv: 2605.10458 · v1 · submitted 2026-05-11 · cs.LG · cond-mat.mtrl-sci · physics.chem-ph


QT-Net: Rethinking Evaluation of AI Models in Atomic Chemical Space

Marisa Gliege, Martin Rahm, Pablo Martínez Crespo, Richard Beckmann, Robert S. Jordan, Rocío Mercado, Santiago Miret, Stefano Ribes, Vijay Kris Narasimhan


Pith reviewed 2026-05-12 04:02 UTC · model grok-4.3

classification cs.LG · cond-mat.mtrl-sci · physics.chem-ph

keywords atomic properties · graph neural networks · out-of-distribution evaluation · molecular property prediction · partial charges · multipoles · dipole moments


The pith

QT-Net predicts atomic properties for held-out atomic environments, and those predictions improve downstream molecular property prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to improve how machine learning models that predict properties of individual atoms inside molecules are tested. It introduces an evaluation method that clusters similar atomic environments and holds out entire clusters to check whether models can handle environments they have not seen. This protocol is used to compare equivariant and non-equivariant neural networks, and it motivates QT-Net, which learns to output atomic electron populations and multipoles. The results show that these outputs improve predictions of whole-molecule properties and can be summed to recover the molecular dipole moments reported in QM9.

Core claim

QT-Net, a rotationally augmented non-equivariant graph neural network, is shown to predict electron populations and multipoles for atoms in molecules from QM9 that lie outside the training clusters. These per-atom inferences, when fed as features into other models, improve performance on molecular property prediction tasks. Additionally, summing the atomic contributions from QT-Net recovers the ground-truth molecular dipole moments reported in the QM9 dataset.

What carries the argument

QT-Net, a rotationally augmented non-equivariant graph neural network that outputs per-atom electron populations and multipoles.
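As a rough illustration of what "rotationally augmented" means for a non-equivariant model, the sketch below draws a random rotation and applies it to the atomic coordinates and to vector-valued targets together, while scalar targets such as electron populations stay unchanged. The function names, the toy geometry, and the target values are hypothetical placeholders; this is not the QT-Net implementation or its JAX code.

```python
import math
import random


def random_rotation(rng):
    """Draw a 3x3 rotation matrix from a uniformly random unit quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def rotate(rot, vec):
    """Apply a rotation matrix to a 3-vector."""
    return [sum(rot[i][j] * vec[j] for j in range(3)) for i in range(3)]


def augment(positions, populations, dipoles, rng):
    """Rotate coordinates and vector targets together; scalars are unchanged."""
    rot = random_rotation(rng)
    new_positions = [rotate(rot, p) for p in positions]
    new_dipoles = [rotate(rot, d) for d in dipoles]
    return new_positions, list(populations), new_dipoles


# Toy water-like geometry with placeholder targets (not values from the paper).
rng = random.Random(0)
positions = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
populations = [9.2, 0.4, 0.4]
dipoles = [[0.0, -0.3, 0.0], [0.02, 0.01, 0.0], [-0.02, 0.01, 0.0]]
new_positions, new_populations, new_dipoles = augment(
    positions, populations, dipoles, rng
)
```

Higher-rank targets would need the matching transformation (a quadrupole tensor Q, for instance, becomes R Q Rᵀ), which this sketch leaves out.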

If this is right

QT-Net-inferred atomic properties serve as useful input features that enhance downstream molecular property prediction.
Molecular dipole moments computed by aggregating QT-Net's atomic outputs match the ground-truth values in QM9.
The SOAP clustering protocol enables rigorous out-of-distribution testing at the atomic level for model comparisons.
Rotationally augmented non-equivariant models perform competitively with equivariant ones under this evaluation.
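The dipole check in the list above can be sketched numerically. The code assumes the usual atoms-in-molecules decomposition, in which each atom contributes a charge-transfer term (its net charge times its nuclear position) plus its own atomic dipole. All names and the toy numbers are hypothetical, and the paper's exact aggregation may differ.

```python
import math

AU_TO_DEBYE = 2.541746  # 1 e*bohr expressed in Debye


def net_charges(atomic_numbers, populations):
    """Atomic charge q_A = Z_A - N_A from the electron population N_A."""
    return [z - n for z, n in zip(atomic_numbers, populations)]


def molecular_dipole(positions, charges, atomic_dipoles):
    """Sum charge-transfer (q_A * R_A) and atomic dipole (mu_A) terms."""
    total = [0.0, 0.0, 0.0]
    for pos, q, mu in zip(positions, charges, atomic_dipoles):
        for k in range(3):
            total[k] += q * pos[k] + mu[k]
    return total


def magnitude_debye(dipole_au):
    """Dipole magnitude converted from atomic units to Debye."""
    return AU_TO_DEBYE * math.sqrt(sum(c * c for c in dipole_au))


# Toy water-like inputs in atomic units (placeholders, not QT-Net outputs).
positions = [[0.0, 0.0, 0.0], [1.43, 1.11, 0.0], [-1.43, 1.11, 0.0]]
charges = net_charges([8, 1, 1], [9.2, 0.4, 0.4])
atomic_dipoles = [[0.0, -0.3, 0.0], [0.02, 0.01, 0.0], [-0.02, 0.01, 0.0]]
dipole = molecular_dipole(positions, charges, atomic_dipoles)
print(magnitude_debye(dipole))
```

A parity plot such as the one described for Figure 5 would then compare these aggregated magnitudes against the dipole moments stored in QM9, molecule by molecule.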

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This suggests that atomic-scale predictions can act as chemical priors to reduce data requirements in broader molecular modeling.
Applying the same clustering evaluation to other datasets could reveal limitations in current atomic property predictors.
QT-Net's approach might extend to predicting other atomic-level quantities like energies or forces for simulation tasks.

Load-bearing premise

That grouping atomic environments by SOAP descriptors creates a meaningful out-of-distribution test where unseen clusters represent genuinely novel atomic settings.
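To make the premise concrete, here is a minimal sketch of scoring only atoms whose cluster label never appears in training. It assumes the cluster labels have already been computed from SOAP descriptors by some clustering step that is not shown, and it uses mean absolute error as a stand-in metric. The function names and data are hypothetical.

```python
def unseen_label_mask(train_labels, test_labels):
    """Flag test atoms whose cluster label never occurs in the training set."""
    seen = set(train_labels)
    return [label not in seen for label in test_labels]


def masked_mae(targets, predictions, mask):
    """Mean absolute error restricted to atoms selected by the mask."""
    errors = [
        abs(t - p) for t, p, keep in zip(targets, predictions, mask) if keep
    ]
    if not errors:
        raise ValueError("no held-out atoms to score")
    return sum(errors) / len(errors)


# Placeholder labels and values; cluster 1 is seen in training, 3 and 4 are not.
train_labels = [0, 0, 1, 2]
test_labels = [1, 3, 3, 4]
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.5, 2.0, 4.0]

mask = unseen_label_mask(train_labels, test_labels)
score = masked_mae(targets, predictions, mask)
```

Whether an unseen label corresponds to a genuinely novel chemical setting depends entirely on the clustering, which is the premise in question.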

What would settle it

A calculation where molecular dipole moments derived from QT-Net per-atom outputs deviate substantially from QM9 ground truth, or where adding QT-Net features fails to improve molecular property prediction accuracy.

Figures

Figures reproduced from arXiv: 2605.10458 by Marisa Gliege, Martin Rahm, Pablo Martínez Crespo, Richard Beckmann, Robert S. Jordan, Rocío Mercado, Santiago Miret, Stefano Ribes, Vijay Kris Narasimhan.

**Figure 2.** Sample molecules with held out atomic environments. From left to right: …

**Figure 3.** Schematic figure of a layer of message passing of QT-Net.

**Figure 4.** R² scores of ensembles of informed and blind models trained on different fractions of the data in AIMEl, when deployed on the remainder of QM9. Notice that these are not learning curves. The plots in Figures 4 and 12 show that, even when trained on fewer than 300 molecules, both informed and uninformed ensembles achieve R² > 0.7 when deployed on the remainder of QM9. We notice that both ensembles perform o…

**Figure 5.** Parity plot for the molecular dipole moments on the remainder of QM9, computed from the …

**Figure 6.** Number of atoms in each atomic environment. Only held out environment labels are …

**Figure 7.** Number of molecules where an atomic environment occurs in the whole AIMEl subset.

**Figure 8.** Tukey forest plots for CCC scores of each model, for each atom and property in the holdout …

**Figure 9.** Tukey forest plots for CCC scores of each model, for each atom and property for all of the …

**Figure 10.** Distributions of fractions of atoms in training by minimum number of neighbors at each …

**Figure 11.** Value of R² of predictions on AIMEl molecules for each property (α, ∆ε, U0, Cv) at each training fraction. [Plot residue omitted: panels plotting Predictions against Target for Blind and Informed models, with axes in a₀³, Ha, Ha (U0), and cal/(mol·K) (Cv).]

**Figure 12.** Parity plots for ensemble predictions of informed and blind models trained on 0.01 fraction …

**Figure 13.** Parity plot for the OOD predictions of QT-Net for electron populations.

**Figure 14.** Parity plot for the OOD predictions of QT-Net for localization indices.

**Figure 15.** Parity plot for the OOD predictions of QT-Net for atomic contributions to the molecular …

**Figure 16.** Parity plot for the OOD predictions of QT-Net for atomic quadrupole moments.

Original abstract

Atomic properties such as partial charges or multipoles encode chemically meaningful information that can inform downstream molecular property prediction, but their evaluation as machine learning targets has been complicated by the absence of a principled out-of-distribution evaluation protocol at the atomic level. In this work, we propose a held-out evaluation protocol that clusters atomic environments by SOAP descriptors and computes metrics accounting only for cluster labels unseen during training. Following this procedure, we use 5×5 cross-validation and Tukey's HSD to run a statistically rigorous comparison of E(3)-equivariant against non-equivariant, rotationally augmented models for predicting electron populations and multipoles of H, C, N, and O atoms. Building on our results, we introduce the Quantum Topological Neural Network (QT-Net), a rotationally augmented, non-equivariant graph neural network. We show that QT-Net can be used to infer properties of atoms in molecules from QM9 outside our training set, and that these inferred properties can yield improvement when used as input features for downstream molecular property prediction. To further validate the framework, molecular dipole moments computed from QT-Net's per-atom outputs recover the ground-truth values reported in QM9. We release all code and data, including a JAX implementation of QT-Net, to support the broader use of learned QTA properties as inductive biases for atomic-scale molecular machine learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

QT-Net and the SOAP clustering protocol give a workable atomic OOD test that shows some downstream gains and dipole recovery, but the splits allow enough chemical similarity leakage to soften the generalization story.


The main thing to know is that they built a held-out test by clustering atoms with SOAP descriptors and only scoring on unseen clusters, then used that to show QT-Net beating equivariant baselines on QM9 while also lifting molecular property prediction when the per-atom outputs are added as features. The dipole moments reconstructed from those outputs match the QM9 ground truth, which is a clean independent check. They release the JAX code too, which helps anyone who wants to try it.

Referee Report

2 major / 2 minor

Summary. The paper proposes a held-out evaluation protocol for atomic property prediction models that clusters per-atom SOAP descriptors and evaluates only on unseen cluster labels. Using 5×5 cross-validation and Tukey's HSD tests, it compares E(3)-equivariant and non-equivariant (rotationally augmented) models for predicting electron populations and multipoles of H, C, N, O atoms in QM9. It introduces QT-Net, a rotationally augmented non-equivariant GNN, shows that its inferences on atoms from molecules outside the training clusters improve downstream molecular property prediction, and validates the framework by recovering ground-truth QM9 dipole moments from the per-atom outputs. All code and data, including a JAX implementation, are released.
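For readers unfamiliar with the 5×5 scheme, the sketch below generates five independently shuffled repeats of 5-fold cross-validation, which yields 25 train/test splits and therefore 25 scores per model for a post-hoc comparison such as Tukey's HSD. What the folds are drawn over (atoms, molecules, or clusters) is not specified in this summary, so the generic item indices here are an assumption, as are the function name and seed.

```python
import random


def repeated_kfold(n_items, n_splits=5, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle once per repeat so each repeat gives a new partition.
        order = list(range(n_items))
        rng.shuffle(order)
        for fold in range(n_splits):
            # Every n_splits-th item, starting at `fold`, forms the test fold.
            test_idx = order[fold::n_splits]
            held = set(test_idx)
            train_idx = [i for i in order if i not in held]
            yield repeat, fold, train_idx, test_idx


splits = list(repeated_kfold(23))
print(len(splits))  # 5 repeats x 5 folds
```

The per-split scores of each model could then be passed to a multiple-comparison routine, for example `scipy.stats.tukey_hsd` where SciPy is available.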

Significance. If the central claims hold, the work is significant for introducing a statistically grounded atomic-level OOD protocol that moves beyond molecule-level splits, for demonstrating that learned QTA properties can serve as useful inductive biases, and for the independent dipole-moment recovery check. The open release of code and data is a clear strength that supports reproducibility and broader adoption in atomic-scale molecular ML.

major comments (2)

[Evaluation protocol and clustering procedure] The held-out protocol (described in the evaluation section) clusters atomic environments by SOAP descriptors and treats unseen cluster labels as OOD. However, because SOAP is a strictly local descriptor, atoms in chemically similar bonding motifs (e.g., sp2 carbons across different QM9 molecules) can receive high similarity scores yet be placed in different clusters. This risks chemical similarity leakage across the train/test boundary and undermines the claim that QT-Net inferences are performed on genuinely unseen atomic environments. The paper should add quantitative checks (e.g., inter-cluster chemical diversity metrics or molecule-level blocking) to confirm that the splits achieve the intended separation.
[Downstream molecular property prediction experiments] The downstream improvement claim (reported in the results on molecular property prediction) relies on the OOD inferences being meaningful. If the clustering leakage concern is not addressed, the reported gains when using QT-Net outputs as input features cannot be confidently attributed to true generalization rather than residual training-set correlation. An ablation that compares against a random-cluster baseline or a molecule-level split would strengthen this result.

minor comments (2)

[Dipole moment validation] The abstract states that dipole moments are recovered from per-atom outputs, but the corresponding results section would benefit from an explicit equation showing how the vector sum is formed and any assumptions about charge neutrality.
[Background on atomic properties] Notation for the electron population and multipole targets is introduced without a clear reference to the underlying QTAIM definitions; adding a short paragraph or citation in the background section would improve accessibility.
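
As a hedged illustration of the equation the first minor comment asks for, the standard atoms-in-molecules decomposition can be written as follows. The symbols are assumptions introduced here and are not taken from the paper: $Z_A$ is the nuclear charge, $N_A$ the electron population, $\mathbf{R}_A$ the nuclear position, $\boldsymbol{\mu}_A$ the atomic dipole measured from the nucleus, and $\mathbf{t}$ an arbitrary shift of the coordinate origin.

```latex
\begin{align}
  q_A &= Z_A - N_A, \\
  \boldsymbol{\mu}_{\mathrm{mol}}
    &= \sum_{A} \left( q_A \mathbf{R}_A + \boldsymbol{\mu}_A \right), \\
  \boldsymbol{\mu}_{\mathrm{mol}}'
    &= \sum_{A} \left( q_A \left( \mathbf{R}_A + \mathbf{t} \right)
       + \boldsymbol{\mu}_A \right)
     = \boldsymbol{\mu}_{\mathrm{mol}} + \mathbf{t} \sum_{A} q_A .
\end{align}
```

The last line shows why charge neutrality matters: the summed dipole is independent of the origin only when $\sum_A q_A = 0$, so predicted populations that do not sum to the molecule's electron count would introduce an origin-dependent error.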

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We appreciate the constructive feedback on our evaluation protocol and downstream experiments. Below, we provide point-by-point responses to the major comments and indicate the revisions we plan to incorporate.


Referee: [Evaluation protocol and clustering procedure] The held-out protocol (described in the evaluation section) clusters atomic environments by SOAP descriptors and treats unseen cluster labels as OOD. However, because SOAP is a strictly local descriptor, atoms in chemically similar bonding motifs (e.g., sp2 carbons across different QM9 molecules) can receive high similarity scores yet be placed in different clusters. This risks chemical similarity leakage across the train/test boundary and undermines the claim that QT-Net inferences are performed on genuinely unseen atomic environments. The paper should add quantitative checks (e.g., inter-cluster chemical diversity metrics or molecule-level blocking) to confirm that the splits achieve the intended separation.

Authors: We agree that SOAP descriptors are local and that clustering algorithms may place chemically related atoms into separate clusters when their descriptor vectors differ sufficiently. The protocol defines OOD strictly via unseen cluster labels rather than claiming perfect chemical isolation. To address the concern directly, we will add quantitative checks in the revised manuscript: average inter-cluster SOAP cosine similarities, distributions of bond orders and atomic coordination numbers across clusters, and an analysis of molecule-level overlap. These metrics will quantify the achieved separation and support the claim that test atoms represent environments outside the training distribution in descriptor space. revision: yes
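
One way the proposed inter-cluster similarity check might look is sketched below. It averages the cosine similarity between cluster centroids, which is one of several reasonable readings of "average inter-cluster SOAP cosine similarities"; an all-pairs average over atoms would be another. The descriptors are tiny placeholder vectors rather than real SOAP output, and all names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def mean_inter_cluster_cosine(descriptors, labels):
    """Average cosine similarity over all pairs of cluster centroids."""
    groups = {}
    for vec, label in zip(descriptors, labels):
        groups.setdefault(label, []).append(vec)
    if len(groups) < 2:
        raise ValueError("need at least two clusters")
    centroids = {label: centroid(vecs) for label, vecs in groups.items()}
    pairs = list(combinations(sorted(centroids), 2))
    return sum(cosine(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)


# Placeholder descriptors: two clusters with orthogonal centroids.
descriptors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
similarity = mean_inter_cluster_cosine(descriptors, labels)
```

Values near 1 between a held-out cluster and a training cluster would point to the kind of leakage the referee describes.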
Referee: [Downstream molecular property prediction experiments] The downstream improvement claim (reported in the results on molecular property prediction) relies on the OOD inferences being meaningful. If the clustering leakage concern is not addressed, the reported gains when using QT-Net outputs as input features cannot be confidently attributed to true generalization rather than residual training-set correlation. An ablation that compares against a random-cluster baseline or a molecule-level split would strengthen this result.

Authors: We concur that linking downstream gains to true OOD generalization requires additional controls. In the revision we will include an ablation that replaces the SOAP-based clustering with random cluster assignments while preserving the same held-out fraction, allowing direct comparison of whether the descriptor-driven splits produce distinct benefits. We will also report atomic-property prediction and downstream results under a conventional molecule-level train/test split. These experiments will help isolate the contribution of the learned QTA properties from any residual correlations. revision: yes
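
The random-cluster control described in this response could be set up along the following lines. Permuting the existing labels across atoms keeps every cluster's size, so holding out the same label set removes the same fraction of atoms while breaking the link between label and descriptor. This is a sketch under that assumption, with hypothetical names and toy labels, not the authors' planned procedure.

```python
import random


def shuffle_labels(labels, seed=0):
    """Randomly permute cluster labels across atoms, keeping cluster sizes."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def held_out_fraction(labels, held_out):
    """Fraction of atoms whose label belongs to the held-out set."""
    return sum(label in held_out for label in labels) / len(labels)


# Toy labels: clusters 2 and 3 are held out in both the real and random split.
labels = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
held_out = {2, 3}
random_labels = shuffle_labels(labels, seed=42)

original_fraction = held_out_fraction(labels, held_out)
random_fraction = held_out_fraction(random_labels, held_out)
```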

Circularity Check

0 steps flagged

No significant circularity in QT-Net derivation


The paper defines an atomic-level OOD protocol via SOAP clustering and 5×5 cross-validation, trains QT-Net on electron populations and multipoles, then validates via downstream molecular property gains and independent recovery of QM9 dipole moments from per-atom outputs. These steps rely on external QM9 benchmarks and empirical comparisons rather than reducing to fitted inputs or self-definitions by construction. No load-bearing self-citations, no ansatz smuggling, and no renaming of known results appear in the provided text. The central claims remain self-contained against external data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no details on specific free parameters, axioms, or invented entities; the model and protocol are described at a high level without equations or implementation specifics.

