Tensor Channel Equivariant Graph Neural Networks for Molecular Polarizability Prediction
Pith reviewed 2026-05-19 20:30 UTC · model grok-4.3
pith:FHFH2UCH
The pith
Propagating explicit tensor channels through message passing improves molecular polarizability tensor predictions over readout-only baselines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a tensor-channel equivariant graph neural network, which propagates explicit symmetric rank-2 tensor channels throughout message passing with geometrically motivated bases, achieves lower full-tensor and anisotropic error than PaiNN-style readout baselines and dielectric MACE baselines on optimized QM7-X geometries at nearly identical parameter count. The improvement is attributed to explicit tensor propagation combined with a traceless target parameterization matched to the anisotropic component, rather than to increased capacity. Rotational equivariance tests confirm that all models preserve equivariance, so the accuracy differences arise from better learning of the target tensor itself.
What carries the argument
Symmetric rank-2 tensor channels propagated through message passing using interactions between learned directional features, aligned with the isotropic-anisotropic decomposition of the polarizability tensor.
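The isotropic-anisotropic decomposition mentioned above can be illustrated with a minimal NumPy sketch. The function name and the numerical values are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import numpy as np

def decompose_polarizability(alpha):
    """Split a 3x3 tensor into an isotropic scalar and a symmetric traceless part."""
    alpha = np.asarray(alpha, dtype=float)
    sym = 0.5 * (alpha + alpha.T)      # enforce the symmetry of the polarizability
    iso = np.trace(sym) / 3.0          # isotropic (mean) polarizability
    aniso = sym - iso * np.eye(3)      # symmetric traceless remainder
    return iso, aniso

# Illustrative values only, not data from the paper.
alpha = np.array([[10.0, 1.0, 0.0],
                  [1.0, 12.0, 0.5],
                  [0.0, 0.5, 14.0]])
iso, aniso = decompose_polarizability(alpha)
```

Adding `iso * np.eye(3)` back to `aniso` recovers the symmetric input, which is what allows a model to predict the two parts separately and recombine them at the end.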
If this is right
- The model reports lower full-tensor and anisotropic error than both a PaiNN-style readout baseline and a dielectric MACE baseline under matched training conditions.
- It outperforms MACE while remaining substantially faster at inference.
- The gain does not arise from increased capacity alone but from the combination of explicit tensor propagation and traceless target parameterization.
- Among tensor bases, interactions between learned directional features produce the strongest results.
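One plausible reading of "interactions between learned directional features" is a basis built from symmetrized, trace-removed outer products of per-atom vector features. The sketch below assumes that reading; the function names, the pairing scheme, and the random vectors are hypothetical and may differ from the bases the paper actually uses.

```python
import numpy as np

def traceless_sym_outer(u, v):
    """Symmetric traceless part of the outer product of two 3-vectors."""
    outer = np.outer(u, v)
    sym = 0.5 * (outer + outer.T)
    return sym - (np.trace(sym) / 3.0) * np.eye(3)

def pairwise_tensor_basis(vectors):
    """Stack one traceless symmetric tensor per vector pair (a <= b)."""
    n = len(vectors)
    return np.stack([traceless_sym_outer(vectors[a], vectors[b])
                     for a in range(n) for b in range(a, n)])

rng = np.random.default_rng(0)
V = rng.normal(size=(4, 3))        # 4 hypothetical directional features on one atom
basis = pairwise_tensor_basis(V)   # 10 pairs, so the shape is (10, 3, 3)
```

Because each element is built only from outer products of vectors, rotating the input vectors by R rotates every basis tensor T to R T Rᵀ, which is the property a tensor channel needs in order to stay equivariant.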
Where Pith is reading between the lines
- The same target-aligned propagation idea could be tested on other rank-2 tensor properties such as molecular quadrupole moments.
- If the directional-feature interactions remain effective on larger or more diverse datasets, the architecture may scale to high-throughput property screening.
- A direct comparison on non-optimized geometries would clarify whether the benefit persists when input structures carry realistic thermal noise.
Load-bearing premise
The performance gain comes from the explicit tensor propagation and traceless parameterization rather than from differences in optimization dynamics or data preprocessing that were not fully controlled.
What would settle it
Retraining the proposed model and all baselines with identical random seeds, optimizer schedules, and preprocessing steps and finding no statistically significant error reduction would falsify the claim that the tensor-channel architecture is responsible for the improvement.
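A seed-matched comparison of this kind could be scored with a paired test on per-seed errors. The following is a minimal sketch using an exact sign-flip permutation test; the choice of test and the function name are assumptions made here, not a procedure described in the paper.

```python
import itertools
import numpy as np

def paired_sign_flip_pvalue(errors_a, errors_b):
    """Exact two-sided sign-flip permutation test on paired per-seed errors."""
    diff = np.asarray(errors_a, dtype=float) - np.asarray(errors_b, dtype=float)
    observed = abs(diff.mean())
    count = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in itertools.product((1.0, -1.0), repeat=len(diff)):
        total += 1
        if abs((diff * np.array(signs)).mean()) >= observed - 1e-12:
            count += 1
    return count / total
```

With n paired seeds the smallest p-value this test can return is 2 / 2**n, so a small number of seeds limits how strong a significance statement can be, regardless of how large the error gap is.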
Original abstract
We introduce a tensor-channel equivariant graph neural network for direct prediction of molecular polarizability tensors. Building on the efficient PaiNN architecture, we augment the hidden representation with explicit symmetric rank-2 tensor channels aligned with the decomposition of polarizability into isotropic and anisotropic components. In contrast to approaches that construct tensor outputs only at readout, our model propagates tensor structure throughout message passing using geometrically motivated tensor bases. This yields a target-aligned architecture for tensor-valued molecular prediction. On optimized QM7-X geometries, the proposed model achieves lower full-tensor and anisotropic error than both a PaiNN-style readout baseline and a dielectric MACE baseline under matched training conditions and at nearly identical parameter count. In this controlled setting, it also outperforms MACE while remaining substantially faster at inference. Ablation studies show that the gain does not arise from increased capacity alone, but from the combination of explicit tensor propagation and a traceless target parameterization matched to the anisotropic part of the polarizability tensor. Among the tensor bases considered, the strongest results are obtained from interactions between learned directional features, indicating that these are particularly effective for modeling molecular polarizability. Rotational equivariance tests further confirm that all compared models are numerically equivariant, so the observed improvements are attributable to better learning of the target tensor itself. Overall, our results show that for structured tensor-valued targets, propagating target-aligned tensor features can outperform both readout-only tensor construction and a more general higher-order equivariant model in the present training setting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a tensor-channel equivariant GNN extending PaiNN with explicit symmetric rank-2 tensor channels that propagate throughout message passing, aligned to the isotropic/anisotropic decomposition of molecular polarizability. On optimized QM7-X geometries it reports lower full-tensor and anisotropic error than a PaiNN-style readout baseline and a dielectric MACE baseline at nearly identical parameter count under matched training conditions; ablations attribute the gains to tensor propagation plus traceless target parameterization rather than capacity alone, and rotational-equivariance tests confirm numerical equivariance of all models.
Significance. If the attribution to tensor propagation holds after tighter controls, the result provides concrete evidence that target-aligned tensor features can outperform both readout-only tensor construction and more general higher-order equivariant models for structured tensor targets in molecular ML, while remaining computationally lighter than MACE. This strengthens the case for architecture choices that mirror the algebraic structure of the target rather than relying solely on universal equivariant layers.
major comments (2)
- [§4] §4 (experimental results and ablation studies): the central claim that gains arise from explicit tensor propagation and traceless parameterization rather than optimization or preprocessing differences rests on the assertion of 'matched training conditions' and 'nearly identical parameter count.' No quantitative confirmation is supplied that learning-rate schedules, batch sizes, early-stopping criteria, optimizer states, or coordinate centering/normalization were locked identically across the PaiNN readout baseline, dielectric MACE, and the proposed model; this directly affects whether the ablation isolates the architectural contribution.
- [Results tables] Results tables (e.g., Table 2 or equivalent performance table): while lower errors are reported, the absence of error bars, exact numerical values for full-tensor and anisotropic MAE, and details on training/validation/test splits in the visible abstract and summary sections makes it difficult to assess statistical robustness of the claimed improvements over baselines.
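For orientation, the two error measures discussed in these comments can be sketched as follows. The paper's exact metric definitions are not visible in this summary, so the componentwise mean over all nine tensor entries used here is an assumption, as are the function names.

```python
import numpy as np

def traceless_part(t):
    """Remove the isotropic part from a batch of 3x3 tensors of shape (B, 3, 3)."""
    trace = np.trace(t, axis1=-2, axis2=-1)
    return t - trace[..., None, None] / 3.0 * np.eye(3)

def full_tensor_mae(pred, target):
    """Mean absolute error over all entries of the full tensors (assumed definition)."""
    return float(np.mean(np.abs(pred - target)))

def anisotropic_mae(pred, target):
    """Mean absolute error between the traceless parts only (assumed definition)."""
    return float(np.mean(np.abs(traceless_part(pred) - traceless_part(target))))
```

Under these definitions a prediction that is wrong only by a multiple of the identity has zero anisotropic error but nonzero full-tensor error, which is why the two numbers are worth reporting separately.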
minor comments (2)
- [§3.2] §3.2 (tensor bases): the description of interactions between learned directional features would be clearer with an explicit diagram or pseudocode showing how the geometrically motivated bases are constructed and combined during message passing.
- [Methods] Notation: the distinction between the full polarizability tensor and its traceless anisotropic part is used throughout but could be reinforced by a short reminder equation in the methods when the target parameterization is introduced.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help clarify the experimental rigor of our work. We address each major point below and have revised the manuscript accordingly to provide the requested details and strengthen the presentation of results.
Point-by-point responses
-
Referee: [§4] §4 (experimental results and ablation studies): the central claim that gains arise from explicit tensor propagation and traceless parameterization rather than optimization or preprocessing differences rests on the assertion of 'matched training conditions' and 'nearly identical parameter count.' No quantitative confirmation is supplied that learning-rate schedules, batch sizes, early-stopping criteria, optimizer states, or coordinate centering/normalization were locked identically across the PaiNN readout baseline, dielectric MACE, and the proposed model; this directly affects whether the ablation isolates the architectural contribution.
Authors: We agree that explicit documentation of the training protocol is necessary to isolate architectural effects. All models were trained under identical conditions using the same codebase, optimizer (Adam), learning-rate schedule (with the same initial rate and decay), batch size, early-stopping patience, and preprocessing pipeline (including identical coordinate centering and normalization). In the revised manuscript we will insert a new subsection in §4 that tabulates these exact shared settings for each model, along with a statement confirming that no per-model hyperparameter tuning was performed beyond architecture-specific adjustments required for tensor channels. This addition directly addresses the concern and allows readers to reproduce the matched conditions.
revision: yes
-
Referee: [Results tables] Results tables (e.g., Table 2 or equivalent performance table): while lower errors are reported, the absence of error bars, exact numerical values for full-tensor and anisotropic MAE, and details on training/validation/test splits in the visible abstract and summary sections makes it difficult to assess statistical robustness of the claimed improvements over baselines.
Authors: We acknowledge that error bars and precise numerical reporting improve statistical interpretability. The full experimental section already specifies the QM7-X split (80/10/10) and that results are averaged over three random seeds, but these details were not highlighted in the main tables. In the revision we will (i) add standard-deviation error bars to all reported MAE values in the performance tables, (ii) list the exact numerical MAE figures for both full-tensor and anisotropic components, and (iii) explicitly restate the train/validation/test split ratios and seed-averaging procedure in the table captions and §4 text. These changes make the robustness of the improvements transparent without altering any conclusions.
revision: yes
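The seed-averaged reporting described in this response could be produced along the following lines. This is a sketch using only the Python standard library; the helper names are hypothetical and the three MAE values are placeholders, not results from the paper.

```python
import statistics

def summarize_over_seeds(values):
    """Return (mean, sample standard deviation) of per-seed metric values."""
    if len(values) < 2:
        raise ValueError("need at least two seeds for a standard deviation")
    return statistics.mean(values), statistics.stdev(values)

def format_with_error_bar(values, digits=3):
    """Format per-seed values as 'mean ± std' for a results table."""
    mean, std = summarize_over_seeds(values)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"

placeholder_mae = [0.10, 0.12, 0.11]   # placeholder values, not from the paper
print(format_with_error_bar(placeholder_mae))   # prints 0.110 ± 0.010
```

`statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is the usual choice when only a handful of seeds is available.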
Circularity Check
No circularity: empirical model comparison with independent benchmarks
The paper introduces an architectural extension of PaiNN with explicit tensor channels for direct polarizability tensor prediction and supports its claims via controlled empirical comparisons on QM7-X against PaiNN-style and MACE baselines, plus ablation studies attributing gains to tensor propagation and traceless parameterization. No mathematical derivation, uniqueness theorem, or prediction is presented that reduces, through the paper's own equations or self-citation, to a fitted input or prior ansatz by construction. All load-bearing steps rely on external data and published baselines rather than internal redefinition.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Rotational equivariance of the network is preserved when tensor channels are propagated using geometrically motivated bases.
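An assumption of this kind is usually checked numerically by comparing the model output on rotated inputs with the rotated output on the original inputs. The sketch below does this for a toy function that stands in for a trained network; `toy_tensor_model` and the other names are hypothetical and unrelated to the paper's architecture.

```python
import numpy as np

def random_rotation(rng):
    """Draw a proper rotation matrix (det = +1) via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0            # flip one axis to turn a reflection into a rotation
    return q

def toy_tensor_model(pos):
    """Hypothetical stand-in for a network: positions (N, 3) -> 3x3 tensor."""
    diff = pos[:, None, :] - pos[None, :, :]     # pairwise displacement vectors
    weight = np.exp(-np.sum(diff**2, axis=-1))   # rotation-invariant radial weight
    return np.einsum("ij,ija,ijb->ab", weight, diff, diff)

def equivariance_error(model, pos, rot):
    """Max abs deviation between model(R x) and R model(x) R^T."""
    out_of_rotated = model(pos @ rot.T)
    rotated_out = rot @ model(pos) @ rot.T
    return float(np.max(np.abs(out_of_rotated - rotated_out)))

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))
err = equivariance_error(toy_tensor_model, pos, random_rotation(rng))
```

For an equivariant model the reported error should sit at floating-point noise level; a model that is only approximately equivariant would show a clearly larger deviation.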
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking (tag: echoes)
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Passage: "We introduce a tensor-channel equivariant graph neural network... augment the hidden representation with explicit symmetric rank-2 tensor channels aligned with the decomposition of polarizability into isotropic and anisotropic components... traceless target parameterization"
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  UNCLEAR: Relation between the paper passage and the cited Recognition theorem.
  Passage: "Using a symmetric traceless basis is particularly natural for molecular polarizability, where the trace corresponds to the isotropic average response, while the traceless part captures the directional anisotropy."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.