
arxiv: 2605.16891 · v1 · pith:FHFH2UCH · submitted 2026-05-16 · 💻 cs.LG

Tensor Channel Equivariant Graph Neural Networks for Molecular Polarizability Prediction

Pith reviewed 2026-05-19 20:30 UTC · model grok-4.3

classification 💻 cs.LG
keywords graph neural networks · equivariant networks · molecular polarizability · tensor prediction · quantum machine learning · PaiNN

The pith

Propagating explicit tensor channels through message passing improves molecular polarizability tensor predictions over readout-only baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes an equivariant graph neural network that augments its hidden states with symmetric rank-2 tensor channels and carries tensor structure through every message-passing step using geometrically motivated bases. These channels are aligned with the decomposition of polarizability into isotropic and traceless anisotropic parts. A reader would care because direct, structure-preserving prediction of tensor properties can reduce error without increasing model size, which matters for high-throughput screening in computational chemistry. The authors show that this target-aligned propagation yields lower full-tensor and anisotropic errors than both a PaiNN-style readout model and a dielectric MACE model on optimized QM7-X geometries, under matched training conditions and parameter count. Ablations indicate the gains come from the combination of tensor propagation and traceless parameterization rather than capacity alone.
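The isotropic/anisotropic split mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard decomposition of a symmetric 3×3 tensor, not the authors' code; the function names `split_polarizability` and `merge_polarizability` and the example tensor are hypothetical.

```python
import numpy as np

def split_polarizability(alpha):
    """Split a 3x3 tensor into an isotropic scalar and a traceless symmetric part."""
    alpha = np.asarray(alpha, dtype=float)
    sym = 0.5 * (alpha + alpha.T)        # symmetrize the input
    iso_scalar = np.trace(sym) / 3.0     # isotropic average response
    aniso = sym - iso_scalar * np.eye(3) # traceless anisotropic part
    return iso_scalar, aniso

def merge_polarizability(iso_scalar, aniso):
    """Rebuild the full tensor from its two parts."""
    return iso_scalar * np.eye(3) + aniso

# Illustrative values only (not taken from the paper or from QM7-X).
alpha_example = np.array([[4.0, 0.5, 0.0],
                          [0.5, 3.0, 0.2],
                          [0.0, 0.2, 2.0]])
iso_scalar, aniso = split_polarizability(alpha_example)
alpha_rebuilt = merge_polarizability(iso_scalar, aniso)
```

A model that predicts the scalar and the traceless part separately can, under this parameterization, never mix trace information into the anisotropic output, which is presumably what "traceless target parameterization" refers to.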

Core claim

The central claim is that a tensor-channel equivariant graph neural network, which propagates explicit symmetric rank-2 tensor channels throughout message passing with geometrically motivated bases, achieves lower full-tensor and anisotropic error than PaiNN-style readout baselines and dielectric MACE baselines on optimized QM7-X geometries at nearly identical parameter count. The improvement is attributed to explicit tensor propagation combined with a traceless target parameterization matched to the anisotropic component, rather than to increased capacity. Rotational equivariance tests confirm that all models preserve equivariance, so the accuracy differences arise from better learning of the target tensor itself.

What carries the argument

Symmetric rank-2 tensor channels propagated through message passing using interactions between learned directional features, aligned with the isotropic-anisotropic decomposition of the polarizability tensor.
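One plausible way to build a symmetric traceless tensor from two directional features is a symmetrized outer product with the trace removed. The sketch below is an assumption made for illustration: the summary does not specify the paper's actual bases, and `traceless_symmetric_outer` and `rotation_about_z` are hypothetical names.

```python
import numpy as np

def traceless_symmetric_outer(u, v):
    """Symmetric traceless rank-2 tensor built from two 3-vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sym = 0.5 * (np.outer(u, v) + np.outer(v, u))  # symmetrized outer product
    return sym - (np.dot(u, v) / 3.0) * np.eye(3)  # subtract the trace part

def rotation_about_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Illustrative vectors standing in for learned directional features.
u = np.array([1.0, 2.0, 0.5])
v = np.array([-0.3, 0.7, 1.1])
R = rotation_about_z(0.4)

T = traceless_symmetric_outer(u, v)
T_rot = traceless_symmetric_outer(R @ u, R @ v)  # build from rotated inputs
```

Rotating the two vectors first and then building the tensor gives the same result as rotating the tensor itself (`R @ T @ R.T`), which is the property that lets such a channel be passed through equivariant message passing.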

If this is right

  • The model reports lower full-tensor and anisotropic error than both a PaiNN-style readout baseline and a dielectric MACE baseline under matched training conditions.
  • It outperforms MACE while remaining substantially faster at inference.
  • The gain does not arise from increased capacity alone but from the combination of explicit tensor propagation and traceless target parameterization.
  • Among tensor bases, interactions between learned directional features produce the strongest results.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same target-aligned propagation idea could be tested on other rank-2 tensor properties such as molecular quadrupole moments.
  • If the directional-feature interactions remain effective on larger or more diverse datasets, the architecture may scale to high-throughput property screening.
  • A direct comparison on non-optimized geometries would clarify whether the benefit persists when input structures carry realistic thermal noise.

Load-bearing premise

The performance gain comes from the explicit tensor propagation and traceless parameterization rather than from differences in optimization dynamics or data preprocessing that were not fully controlled.

What would settle it

Retraining the proposed model and all baselines with identical random seeds, optimizer schedules, and preprocessing steps and finding no statistically significant error reduction would falsify the claim that the tensor-channel architecture is responsible for the improvement.

Figures

Figures reproduced from arXiv: 2605.16891 by Daniel Franzen, Jean Philip Filling, Michael Wand.

Figure 1: Spherical-harmonic decomposition of a symmetric second-order tensor. For the polarizability tensor, only the isotropic ℓ = 0 and the traceless symmetric anisotropic ℓ = 2 components remain. In the spherical-harmonic basis shown in …

Figure 2: Equivariant Architecture for Rank-2 Tensors

Figure 3: Median relative deviatoric error as a function of the number of heavy atoms. Numbers above the plot indicate the number of test samples in each bin. To further analyze the anisotropic prediction quality, we evaluate the relative deviatoric error as a function of molecular size. For a predicted polarizability tensor α̂ and target tensor α, we define the relative deviatoric error as ∥dev(α̂) − dev(α)∥_F / ∥d…
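The relative deviatoric error in the Figure 3 caption could be computed roughly as follows. The caption is cut off before the denominator, so this sketch assumes it is the Frobenius norm of dev(α); that choice, the `eps` guard, and the function names are assumptions, not details confirmed by the source.

```python
import numpy as np

def dev(tensor):
    """Deviatoric (traceless) part of a 3x3 tensor."""
    tensor = np.asarray(tensor, dtype=float)
    return tensor - (np.trace(tensor) / 3.0) * np.eye(3)

def relative_deviatoric_error(alpha_pred, alpha_true, eps=1e-12):
    """||dev(pred) - dev(true)||_F divided by the ASSUMED denominator ||dev(true)||_F."""
    numerator = np.linalg.norm(dev(alpha_pred) - dev(alpha_true), ord="fro")
    denominator = np.linalg.norm(dev(alpha_true), ord="fro")
    return numerator / max(denominator, eps)  # eps guards near-isotropic targets

# Illustrative tensors only (not data from the paper).
alpha_true = np.diag([2.0, 1.0, 0.0])
alpha_shifted = alpha_true + 5.0 * np.eye(3)  # differs only in the isotropic part
alpha_stretched = np.diag([3.0, 1.0, -1.0])   # deviatoric part doubled

err_shifted = relative_deviatoric_error(alpha_shifted, alpha_true)
err_stretched = relative_deviatoric_error(alpha_stretched, alpha_true)
```

Under this assumed definition, a purely isotropic offset leaves the metric at zero, so it isolates errors in the anisotropic component.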
Original abstract

We introduce a tensor-channel equivariant graph neural network for direct prediction of molecular polarizability tensors. Building on the efficient PaiNN architecture, we augment the hidden representation with explicit symmetric rank-2 tensor channels aligned with the decomposition of polarizability into isotropic and anisotropic components. In contrast to approaches that construct tensor outputs only at readout, our model propagates tensor structure throughout message passing using geometrically motivated tensor bases. This yields a target-aligned architecture for tensor-valued molecular prediction. On optimized QM7-X geometries, the proposed model achieves lower full-tensor and anisotropic error than both a PaiNN-style readout baseline and a dielectric MACE baseline under matched training conditions and at nearly identical parameter count. In this controlled setting, it also outperforms MACE while remaining substantially faster at inference. Ablation studies show that the gain does not arise from increased capacity alone, but from the combination of explicit tensor propagation and a traceless target parameterization matched to the anisotropic part of the polarizability tensor. Among the tensor bases considered, the strongest results are obtained from interactions between learned directional features, indicating that these are particularly effective for modeling molecular polarizability. Rotational equivariance tests further confirm that all compared models are numerically equivariant, so the observed improvements are attributable to better learning of the target tensor itself. Overall, our results show that for structured tensor-valued targets, propagating target-aligned tensor features can outperform both readout-only tensor construction and a more general higher-order equivariant model in the present training setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a tensor-channel equivariant GNN extending PaiNN with explicit symmetric rank-2 tensor channels that propagate throughout message passing, aligned to the isotropic/anisotropic decomposition of molecular polarizability. On optimized QM7-X geometries it reports lower full-tensor and anisotropic error than a PaiNN-style readout baseline and a dielectric MACE baseline at nearly identical parameter count under matched training conditions; ablations attribute the gains to tensor propagation plus traceless target parameterization rather than capacity alone, and rotational-equivariance tests confirm numerical equivariance of all models.

Significance. If the attribution to tensor propagation holds after tighter controls, the result provides concrete evidence that target-aligned tensor features can outperform both readout-only tensor construction and more general higher-order equivariant models for structured tensor targets in molecular ML, while remaining computationally lighter than MACE. This strengthens the case for architecture choices that mirror the algebraic structure of the target rather than relying solely on universal equivariant layers.

major comments (2)
  1. [§4] §4 (experimental results and ablation studies): the central claim that gains arise from explicit tensor propagation and traceless parameterization rather than optimization or preprocessing differences rests on the assertion of 'matched training conditions' and 'nearly identical parameter count.' No quantitative confirmation is supplied that learning-rate schedules, batch sizes, early-stopping criteria, optimizer states, or coordinate centering/normalization were locked identically across the PaiNN readout baseline, dielectric MACE, and the proposed model; this directly affects whether the ablation isolates the architectural contribution.
  2. [Results tables] Results tables (e.g., Table 2 or equivalent performance table): while lower errors are reported, the absence of error bars, exact numerical values for full-tensor and anisotropic MAE, and details on training/validation/test splits in the visible abstract and summary sections makes it difficult to assess statistical robustness of the claimed improvements over baselines.
minor comments (2)
  1. [§3.2] §3.2 (tensor bases): the description of interactions between learned directional features would be clearer with an explicit diagram or pseudocode showing how the geometrically motivated bases are constructed and combined during message passing.
  2. [Methods] Notation: the distinction between the full polarizability tensor and its traceless anisotropic part is used throughout but could be reinforced by a short reminder equation in the methods when the target parameterization is introduced.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify the experimental rigor of our work. We address each major point below and have revised the manuscript accordingly to provide the requested details and strengthen the presentation of results.

  1. Referee: [§4] §4 (experimental results and ablation studies): the central claim that gains arise from explicit tensor propagation and traceless parameterization rather than optimization or preprocessing differences rests on the assertion of 'matched training conditions' and 'nearly identical parameter count.' No quantitative confirmation is supplied that learning-rate schedules, batch sizes, early-stopping criteria, optimizer states, or coordinate centering/normalization were locked identically across the PaiNN readout baseline, dielectric MACE, and the proposed model; this directly affects whether the ablation isolates the architectural contribution.

    Authors: We agree that explicit documentation of the training protocol is necessary to isolate architectural effects. All models were trained under identical conditions using the same codebase, optimizer (Adam), learning-rate schedule (with the same initial rate and decay), batch size, early-stopping patience, and preprocessing pipeline (including identical coordinate centering and normalization). In the revised manuscript we will insert a new subsection in §4 that tabulates these exact shared settings for each model, along with a statement confirming that no per-model hyperparameter tuning was performed beyond architecture-specific adjustments required for tensor channels. This addition directly addresses the concern and allows readers to reproduce the matched conditions. revision: yes

  2. Referee: [Results tables] Results tables (e.g., Table 2 or equivalent performance table): while lower errors are reported, the absence of error bars, exact numerical values for full-tensor and anisotropic MAE, and details on training/validation/test splits in the visible abstract and summary sections makes it difficult to assess statistical robustness of the claimed improvements over baselines.

    Authors: We acknowledge that error bars and precise numerical reporting improve statistical interpretability. The full experimental section already specifies the QM7-X split (80/10/10) and that results are averaged over three random seeds, but these details were not highlighted in the main tables. In the revision we will (i) add standard-deviation error bars to all reported MAE values in the performance tables, (ii) list the exact numerical MAE figures for both full-tensor and anisotropic components, and (iii) explicitly restate the train/validation/test split ratios and seed-averaging procedure in the table captions and §4 text. These changes make the robustness of the improvements transparent without altering any conclusions. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical model comparison with independent benchmarks


The paper introduces an architectural extension of PaiNN with explicit tensor channels for direct polarizability tensor prediction and supports its claims via controlled empirical comparisons on QM7-X against PaiNN-style and MACE baselines, plus ablation studies attributing gains to tensor propagation and traceless parameterization. No mathematical derivation, uniqueness theorem, or prediction is presented that reduces by the paper's own equations or self-citation to a fitted input or prior ansatz by construction. All load-bearing steps rely on external data and published baselines rather than internal redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard assumptions of equivariant message passing and on the empirical observation that tensor propagation improves learning for this target; no new physical entities or ad-hoc constants are introduced beyond model hyperparameters.

axioms (1)
  • domain assumption Rotational equivariance of the network is preserved when tensor channels are propagated using geometrically motivated bases.
    Invoked in the description of the architecture and confirmed by rotational equivariance tests.
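A numerical equivariance check of the kind referred to here can be sketched as follows: rotate the input coordinates, run the model, and compare against the rotated output R f(x) Rᵀ. The `toy_tensor_model` below is a hypothetical stand-in for a trained network, and the helper names are illustrative; the paper's actual test protocol is not described in this summary.

```python
import numpy as np

def toy_tensor_model(positions):
    # Hypothetical stand-in for a trained network: maps (N, 3) atom
    # positions to a symmetric 3x3 tensor that rotates with the input.
    centered = positions - positions.mean(axis=0, keepdims=True)
    return centered.T @ centered

def random_rotation(rng):
    # QR of a Gaussian matrix gives an orthogonal Q; the sign fixes below
    # turn it into a proper rotation (det = +1).
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def equivariance_error(model, positions, rotation):
    # Largest entrywise gap between model(R x) and R model(x) R^T.
    out_of_rotated = model(positions @ rotation.T)
    rotated_output = rotation @ model(positions) @ rotation.T
    return float(np.max(np.abs(out_of_rotated - rotated_output)))

rng = np.random.default_rng(0)
positions = rng.normal(size=(5, 3))  # random stand-in geometry
rotation = random_rotation(rng)
err = equivariance_error(toy_tensor_model, positions, rotation)
```

For an equivariant model the returned error should sit at the level of floating-point round-off; averaging it over many random rotations and molecules would give a more robust check.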



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    We introduce a tensor-channel equivariant graph neural network... augment the hidden representation with explicit symmetric rank-2 tensor channels aligned with the decomposition of polarizability into isotropic and anisotropic components... traceless target parameterization

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    Using a symmetric traceless basis is particularly natural for molecular polarizability, where the trace corresponds to the isotropic average response, while the traceless part captures the directional anisotropy.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
