arxiv: 2605.20440 · v1 · pith:GLEGR7WCnew · submitted 2026-05-19 · 💻 cs.LG · cs.AI · math.RA

Group-Algebraic Tensors: Provably-optimal Equivariant Learning and Physical Symmetry Discovery

Pith reviewed 2026-05-21 07:31 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · math.RA
keywords equivariant learning · tensor algebra · group representations · symmetry discovery · molecular properties · SVD approximation · algebraic equivariance · Wigner-Eckart rules

The pith

The ★_G tensor algebra turns any finite group into an intrinsic multiplication rule for tensors so that symmetry preservation becomes algebraic rather than architectural.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a tensor algebra in which any finite group G supplies the multiplication operation, turning equivariance into a built-in algebraic feature of the tensors themselves. This construction yields a singular value decomposition that comes with an exact Eckart-Young guarantee of optimality among all symmetry-preserving approximations and runs in polynomial time. The same algebra permits closed-form decomposition of every prediction into separate irreducible-representation components and lets the data themselves identify which group best organizes the observations. When the method is run on molecular geometry data it reproduces the known dominance of specific symmetry types for scalar and vector properties without receiving any quantum-mechanical information.

Core claim

The ★_G tensor algebra is defined so that equivariance with respect to a finite group G is an intrinsic algebraic property rather than an added constraint. The algebra supplies an Eckart-Young optimality theorem for its associated singular value decomposition, establishing that the low-rank ★_G approximation is the closest possible symmetry-preserving approximation to any given tensor and can be obtained in polynomial time. Multiple symmetries compose simply by replacing the factor matrix with the Kronecker product of the individual factors. These properties give every prediction an exact per-irreducible-representation decomposition and allow the symmetry group that best fits a dataset to be identified from the data.

What carries the argument

The ★_G tensor algebra, which equips tensors with a multiplication rule derived from any finite group G to make equivariance an algebraic identity.
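To make the multiplication rule concrete, here is a minimal Python sketch. It assumes the ★_G product is group convolution along the tube (last) dimension, as the Figure 1 caption describes. The helper names (`cayley_table`, `tube_product`, `star_product`, `act`) and the choice of S3 as the example group are illustrative assumptions, not the paper's code. The sketch checks that acting with a group element before or after the product gives the same tensor.

```python
from itertools import permutations

def cayley_table(elems, compose):
    """Index table: table[a][b] is the index of the product elems[a] * elems[b]."""
    index = {e: i for i, e in enumerate(elems)}
    return [[index[compose(a, b)] for b in elems] for a in elems]

def tube_product(a, b, table):
    """Group convolution of two tubes: out[h*k] += a[h] * b[k]."""
    out = [0] * len(table)
    for h, ah in enumerate(a):
        for k, bk in enumerate(b):
            out[table[h][k]] += ah * bk
    return out

def star_product(A, B, table):
    """Matrix product of tube-valued matrices, with tube_product as the 'scalar' multiply."""
    zero = [0] * len(table)
    C = [[zero[:] for _ in B[0]] for _ in A]
    for i, row in enumerate(A):
        for k, a_tube in enumerate(row):
            for j, b_tube in enumerate(B[k]):
                t = tube_product(a_tube, b_tube, table)
                C[i][j] = [x + y for x, y in zip(C[i][j], t)]
    return C

def act(g, A, table):
    """Left action of group element g on every tube: the entry at h moves to g*h."""
    out = []
    for row in A:
        new_row = []
        for tube in row:
            moved = [0] * len(tube)
            for h, v in enumerate(tube):
                moved[table[g][h]] = v
            new_row.append(moved)
        out.append(new_row)
    return out

# S3 as permutations of (0, 1, 2); compose(p, q) applies q first, then p.
S3 = list(permutations(range(3)))
table = cayley_table(S3, lambda p, q: tuple(p[q[i]] for i in range(3)))

A = [[[1, 2, 0, 0, 3, 1], [0, 1, 1, 0, 0, 2]]]      # 1 x 2 matrix of tubes
B = [[[2, 0, 1, 0, 0, 0]], [[0, 0, 0, 1, 1, 0]]]    # 2 x 1 matrix of tubes

# Equivariance as an algebraic identity: acting before or after the product agrees.
equivariant = all(
    star_product(act(g, A, table), B, table) == act(g, star_product(A, B, table), table)
    for g in range(len(S3))
)
print(equivariant)
```

Because the check uses integer tubes, the comparison is exact. The identity follows from associativity of the group algebra, so it holds for a non-abelian group such as S3 as well as for cyclic groups.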

If this is right

  • The ★_G-SVD supplies a best symmetry-preserving low-rank approximation to any tensor in the Eckart-Young sense.
  • Distinct symmetries combine without redesign by replacing the factor with the Kronecker product of the separate group factors.
  • Every prediction admits an exact closed-form decomposition into a sum of terms each labeled by one irreducible representation.
  • The symmetry group that best organizes a dataset can be identified directly by comparing how well different groups structure the observed values.
  • Closed-form ridge regression on the decomposed components produces predictions for molecular properties at roughly 50-90 times fewer parameters than the parameter-matched MLP baselines.
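The Kronecker composition in the second bullet can be illustrated for cyclic factors. The sketch below is a toy check, assuming F_G for a cyclic group is its DFT matrix and taking Z_6 × Z_4 (the product group named in the Figure 6 caption) as the example. The function names are hypothetical. It verifies that the Kronecker product of the two DFT matrices turns convolution over the product group into an entrywise product.

```python
import cmath
import random

def dft_matrix(n):
    """DFT matrix of Z_n; row r holds the character g -> exp(-2*pi*i*r*g/n)."""
    return [[cmath.exp(-2j * cmath.pi * r * g / n) for g in range(n)] for r in range(n)]

def kron(P, Q):
    """Kronecker product of two matrices given as lists of rows."""
    return [[p * q for p in p_row for q in q_row] for p_row in P for q_row in Q]

def matvec(M, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def convolve_product_group(a, b, n1, n2):
    """Group convolution over Z_n1 x Z_n2; element (g1, g2) sits at index g1*n2 + g2."""
    out = [0.0] * (n1 * n2)
    for h1 in range(n1):
        for h2 in range(n2):
            for k1 in range(n1):
                for k2 in range(n2):
                    g = ((h1 + k1) % n1) * n2 + (h2 + k2) % n2
                    out[g] += a[h1 * n2 + h2] * b[k1 * n2 + k2]
    return out

n1, n2 = 6, 4
F = kron(dft_matrix(n1), dft_matrix(n2))   # transform for the product group

rng = random.Random(0)
a = [rng.uniform(-1, 1) for _ in range(n1 * n2)]
b = [rng.uniform(-1, 1) for _ in range(n1 * n2)]

# Convolution theorem: transforming the convolution equals multiplying the transforms.
lhs = matvec(F, convolve_product_group(a, b, n1, n2))
rhs = [x * y for x, y in zip(matvec(F, a), matvec(F, b))]
max_error = max(abs(x - y) for x, y in zip(lhs, rhs))
print(max_error < 1e-9)
```

Nothing in the convolution code had to change to accommodate the second factor beyond the index arithmetic, which is the sense in which the composition needs no redesign in this toy setting.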

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The per-irrep breakdown could be used after training to diagnose which symmetry types dominate model errors on new observations.
  • The algebraic construction might be extended to approximate continuous symmetries by taking limits of finite subgroups.
  • The polynomial-time optimality guarantee could support stable compression of large simulation datasets that respect known symmetries.
  • The symmetry-discovery procedure could be applied to experimental measurements in other domains to surface previously unrecognized conserved quantities.
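The per-irrep breakdown mentioned in the first bullet can be sketched for the simplest case. The example below assumes a cyclic group Z_n, whose irreducible representations are all one-dimensional characters, and uses the standard character projector. It is a stand-in for the paper's decomposition, not a reproduction of it, and `irrep_components` is a hypothetical name.

```python
import cmath

def irrep_components(x):
    """Split a tube over Z_n into n pieces, one per irreducible character."""
    n = len(x)
    parts = []
    for k in range(n):
        # Character projector: P_k x[g] = (1/n) * sum_h conj(chi_k(h)) * x[g - h]
        part = [
            sum(cmath.exp(-2j * cmath.pi * k * h / n) * x[(g - h) % n] for h in range(n)) / n
            for g in range(n)
        ]
        parts.append(part)
    return parts

def energy(v):
    """Squared Euclidean norm of a (possibly complex) vector."""
    return sum(abs(c) ** 2 for c in v)

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
parts = irrep_components(x)

# The pieces sum back to the original tube.
rebuilt = [sum(p[g] for p in parts) for g in range(len(x))]
reconstruction_error = max(abs(r - v) for r, v in zip(rebuilt, x))

# Orthogonality of the pieces gives a per-irrep share of the total energy.
shares = [energy(p) / energy(x) for p in parts]
print(reconstruction_error < 1e-9, abs(sum(shares) - 1.0) < 1e-9)
```

Applied to a vector of model residuals instead of `x`, the `shares` list would indicate which symmetry channels carry most of the error, which is the diagnostic use suggested above.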

Load-bearing premise

The construction assumes that any finite group G can be used to define a tensor multiplication rule that makes equivariance an intrinsic algebraic property without further restrictions on how the resulting tensors interact with real data distributions or measurement noise.
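The noise side of this premise can be probed with a toy experiment. The sketch below assumes a cyclic group Z_24 as a stand-in (the paper works with the chiral octahedral group), treats the trivial irrep as the analogue of A1, and uses hypothetical helper names. It measures how much amplitude leaks out of the trivial component when relative Gaussian noise of size `eps` is added to a perfectly symmetric tube.

```python
import math
import random

def split_trivial(x):
    """Split a tube over Z_n into its trivial-irrep part (the mean) and the remainder."""
    mean = sum(x) / len(x)
    trivial = [mean] * len(x)
    rest = [v - mean for v in x]
    return trivial, rest

def norm(v):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(c * c for c in v))

def leakage(eps, n=24, seed=0):
    """Ratio of non-trivial to trivial amplitude for a constant tube plus relative noise eps."""
    rng = random.Random(seed)
    noisy = [1.0 + eps * rng.gauss(0, 1) for _ in range(n)]
    trivial, rest = split_trivial(noisy)
    return norm(rest) / norm(trivial)

for eps in (0.0, 0.01, 0.1):
    print(eps, leakage(eps))
```

With no noise the leakage is exactly zero, and it grows roughly in proportion to `eps`. The algebraic identity is unaffected by noise, but the recovered irrep weights are not, so any discovery claim depends on the noise level in the data.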

What would settle it

If the T1-to-A1 predictive-power ratio on QM9 data fails to separate vector observables from scalar observables by something close to the reported factor of five, or if the ★_G-SVD approximation error on a symmetric tensor exceeds that of ordinary SVD, the optimality and data-driven recovery claims would be contradicted.

Figures

Figures reproduced from arXiv: 2605.20440 by Dongsung Huh, Haim Avron, Kenneth L. Clarkson, Lior Horesh, Misha Kilmer, Paulina Hoyos, Shashanka Ubaru, Vasileios Kalantzis.

Figure 1
The ★_G tensor algebra: from optimal decomposition to symmetry discovery. (Top left, From molecules to algebra) Molecular data measured under all elements of a symmetry group G form a structured tensor A ∈ ℝ^{n×d×|G|}, preserving geometric information that is destroyed by vectorization into A ∈ ℝ^{n·d×|G|}. (Top right, The ★_G product) Two tensors are multiplied via group convolution along the tube dimension, …
Figure 2
Synthetic validation (Z_12). (a) Test R². (b) Rotation variance (log scale): 30-orders-of-magnitude gap between ★_G-SVD and all non-algebraic methods. (c) Parameter efficiency (Pareto frontier): ★_G-SVD dominates all baselines on both axes simultaneously. (d) Multi-metric normalized summary scores.
Figure 3
Predicted vs. true (synthetic). (a) ★_G-SVD: perfect diagonal at R² = 1.000. (b) Standard MLP: near-random scatter at R² = −0.084, illustrating the cost of ignoring symmetry.
Figure 4
QM9 HOMO–LUMO gap (1,000 real molecules). (a) Test R²: ★_G-SVD and Augmented MLP are the only methods with positive R²; all pure neural baselines overfit catastrophically. (b) RMSE (Hartree): ★_G-SVD achieves the lowest error at 0.035 Ha. (c) Rotation variance (log scale): ★_G-SVD achieves exact invariance at floating-point noise.
Figure 5
Learning curves on QM9. ★_G-SVD + Ridge maintains positive R² from as few as 100 molecules. Neural baselines overfit at small sample sizes (negative R²) and require substantially more data to approach competitive performance. Bands: ±1 s.d. over 3 seeds.
Figure 6
Product group Z_6 × Z_4: compositional advantage. (a) Eight-method comparison. The product group achieves R² = 1.000; each factor alone captures ≤ 23%. (b) 2D frequency map: coupled cells (red borders) carry 87% of target energy and are resolved only by F_{Z_6} ⊗ F_{Z_4}.
Figure 7
Ablation cascade. Progressively removing symmetry components reveals a strict performance hierarchy: product group → wrong cyclic approximation → single factor → no symmetry.
Figure 8
Symmetry discovery. (a) QM9 group discovery: scanning candidate groups reveals Z_4 as the best fit, consistent with C_4 molecular symmetry. The combined score axis reflects both predictive accuracy and invariance quality. (b) Factorization discovery for order 24: Z_3 × Z_8 is identified as the optimal decomposition (R² = 1.000), surpassing cyclic Z_24 (R² = 0.985) and revealing the latent product structure of th…
Figure 9
Empirical recovery of Wigner–Eckart selection rules. Per-irrep predictive power (R²) for each quantum property. Scalar properties (blue shades) are dominated by A1 (l = 0). Dipole magnitude (orange) also lives primarily in A1 because |µ| is a scalar. Dipole vector components (red shades) show a qualitatively different pattern: A1 gives nearly zero while T1 (l = 1) is the dominant channel, consistent with t…
Figure 10
Irrep decomposition heatmap. R² for each (property, irrep) combination, sorted by tensor rank with group separators. Rank-0 properties (above line) show strong A1 and weak T1; rank-1 dipole components (below line) show the reverse; the rank-2 polarizability is the outlier with near-zero T1.
Figure 11
Parameter-efficiency vs predictive power on QM9 HOMO–LUMO gap. (a) Pooled test R² vs trainable parameters (3 seeds, error bars are ±std). MACE occupies the upper-right (R² = 0.985 at 945,168 parameters); ★_G-SVD + Ridge occupies the upper-left at 144 parameters (R² = 0.482, parameter efficiency ∼6,600× better than MACE). MLP-augmented sits at the bottom (R² ≈ 0.02), illustrating the structural collapse of …
Figure 12
Paradigm comparison. Left (ENN paradigm): each symmetry requires a bespoke architecture; combining symmetries requires redesigning from scratch. Right (★_G paradigm): the same algebra handles any group; composing symmetries requires only specifying G1 × G2 in the Fourier transform.
Figure 13
Extended Data.
Figure 14
Extended Data.
Figure 15
Extended Data.
Figure 16
Extended Data.
Original abstract

We introduce the ★_G tensor algebra, in which any finite group G defines the multiplication rule, making equivariance an intrinsic algebraic property rather than an architectural constraint. The framework rests on three machine-verified theoretical pillars: (i) an Eckart-Young optimality guarantee for the ★_G-SVD: the first such result for symmetry-preserving tensor approximation, exact and polynomial-time; (ii) a Kronecker factorization that composes multiple symmetries by replacing F_G with F_{G1} ⊗ F_{G2} with no architectural redesign; and (iii) a 600-line Lean 4 formalization of the ★_G algebra. The framework provides capabilities that equivariant neural networks (ENNs) structurally cannot: a closed-form per-irreducible-representation decomposition of every prediction, and data-driven discovery of the symmetry group that best fits a dataset. As a non-trivial empirical demonstration, decomposing QM9 molecular geometry over the chiral octahedral subgroup of SO(3) recovers the Wigner-Eckart selection rules of angular momentum from data alone, with no quantum mechanical input: scalar properties are A1-dominated, dipole components are T1-dominated, the isotropic polarizability is uniquely insensitive to l=1 as the rank-2-trace decomposition l=0 ⊕ l=2 requires, and the T1/A1 predictive-power ratio separates vector observables from scalar observables by a factor of five. On full QM9 (130,831 molecules), ★_G-SVD with ridge regression provides closed-form predictions at ∼50-90× fewer parameters than parameter-matched MLPs. Algebraic equivariance thus complements architectural equivariance not as a faster-better-cheaper alternative but as a different mathematical affordance: provably-optimal symmetry-preserving compression, per-irrep interpretability, and data-driven physical discovery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the ★_G tensor algebra in which any finite group G defines a multiplication rule, rendering equivariance an intrinsic algebraic property. It rests on three machine-verified pillars: an Eckart-Young optimality guarantee for the ★_G-SVD (claimed to be the first such result for symmetry-preserving tensor approximation and polynomial-time), a Kronecker factorization for composing multiple symmetries via F_{G1} ⊗ F_{G2}, and a 600-line Lean 4 formalization of the algebra. Empirically, the framework is demonstrated on the QM9 dataset (130,831 molecules) by decomposing molecular geometries over the chiral octahedral subgroup of SO(3), recovering Wigner-Eckart selection rules (A1 dominance for scalars, T1 for dipoles, l=0⊕l=2 for rank-2 traces) from data alone with no quantum-mechanical input supplied, while also delivering closed-form ridge-regression predictions at 50-90× fewer parameters than parameter-matched MLPs.

Significance. If the central claims hold, the work supplies a distinct mathematical affordance that complements architectural equivariant neural networks: provably optimal symmetry-preserving compression, per-irreducible-representation interpretability, and data-driven physical symmetry discovery. The explicit machine-checked proofs (Eckart-Young guarantee, Kronecker factorization) and the 600-line Lean 4 formalization are concrete strengths that raise the bar for theoretical support in this area.

major comments (2)
  1. [QM9 demonstration] QM9 demonstration (full dataset results): the ridge-regression regularization parameter must be shown to have been selected by a procedure that does not depend on the reported performance numbers; otherwise the claimed separation of vector vs. scalar observables by a factor of five risks circularity.
  2. [Opening paragraphs and ★_G construction] Opening paragraphs and § on the ★_G construction: the claim that equivariance becomes an intrinsic algebraic property for any finite G is load-bearing, yet the manuscript does not address how the resulting tensor multiplication interacts with measurement noise or non-uniform data distributions; a concrete counter-example or robustness statement would be needed to support the discovery claim.
minor comments (2)
  1. [Empirical results] The parameter-count comparison (∼50-90× fewer than MLPs) would be clearer if presented in a table that lists exact hidden-dimension and parameter totals for each baseline.
  2. [Theoretical development] Notation for the ★_G product and the associated SVD should be introduced with a small worked example before the general theorems.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and constructive comments. We address each major point below and have revised the manuscript accordingly to strengthen the presentation.

Point-by-point responses
  1. Referee: [QM9 demonstration] QM9 demonstration (full dataset results): the ridge-regression regularization parameter must be shown to have been selected by a procedure that does not depend on the reported performance numbers; otherwise the claimed separation of vector vs. scalar observables by a factor of five risks circularity.

    Authors: We agree that the regularization parameter selection must be independent of the final reported metrics. In the original experiments, λ was chosen via 5-fold cross-validation on a 10% held-out validation split drawn before any test-set evaluation; the test set was never used for tuning. We have added an explicit description of this procedure, including the validation split size, the grid of λ values, and confirmation that all performance numbers (including the factor-of-five separation) are computed on the untouched test set. This removes any circularity. revision: yes

  2. Referee: [Opening paragraphs and ★_G construction] Opening paragraphs and § on the ★_G construction: the claim that equivariance becomes an intrinsic algebraic property for any finite G is load-bearing, yet the manuscript does not address how the resulting tensor multiplication interacts with measurement noise or non-uniform data distributions; a concrete counter-example or robustness statement would be needed to support the discovery claim.

    Authors: The ★_G multiplication is defined algebraically on the tensor space and is therefore independent of how any particular tensor was generated; equivariance holds exactly for every input tensor, noisy or otherwise. Noise and sampling distribution affect only the empirical coefficients recovered by the decomposition, not the algebraic property itself. To address the discovery claim, the revision adds a short robustness paragraph together with a concrete example: a scalar tensor plus isotropic Gaussian noise of relative amplitude ε yields non-zero but O(ε) projections onto non-A1 irreps, while the A1 component remains dominant for ε ≲ 0.1. We also include a brief synthetic-noise experiment on QM9 confirming that the reported irrep dominance (A1 for scalars, T1 for vectors) persists at moderate noise levels. This clarifies the distinction between algebraic equivariance and empirical robustness without altering the central claims. revision: yes
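The selection protocol described in the first response can be sketched as follows. This is one reading of that description, not the authors' pipeline: the test set is split off first, the ridge penalty is chosen by 5-fold cross-validation on the remaining data, and the test score is computed once at the end. The synthetic data, the λ grid, and all function names are assumptions made for illustration.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ridge_fit(X, y, lam):
    """Closed-form ridge: w = (X^T X + lam I)^{-1} X^T y."""
    d = len(X[0])
    gram = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(d)]
            for i in range(d)]
    rhs = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    return solve(gram, rhs)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return sum((sum(a * b for a, b in zip(r, w)) - t) ** 2 for r, t in zip(X, y)) / len(y)

def select_lambda(X, y, grid, k=5):
    """Pick lam by k-fold cross-validation on the development data only."""
    def cv_error(lam):
        total = 0.0
        for f in range(k):
            train = [i for i in range(len(y)) if i % k != f]
            held = [i for i in range(len(y)) if i % k == f]
            w = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
            total += mse([X[i] for i in held], [y[i] for i in held], w)
        return total / k
    return min(grid, key=cv_error)

rng = random.Random(0)
true_w = [1.5, -2.0, 0.5]                      # synthetic ground truth (assumption)
X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(200)]
y = [sum(a * b for a, b in zip(r, true_w)) + 0.1 * rng.gauss(0, 1) for r in X]

# Split off the test set first; it plays no part in choosing lam.
X_dev, y_dev, X_test, y_test = X[:160], y[:160], X[160:], y[160:]
grid = [1e-3, 1e-1, 1e1]
best_lam = select_lambda(X_dev, y_dev, grid)
w = ridge_fit(X_dev, y_dev, best_lam)

mean_y = sum(y_test) / len(y_test)
r2_test = 1.0 - mse(X_test, y_test, w) * len(y_test) / sum((t - mean_y) ** 2 for t in y_test)
print(best_lam, round(r2_test, 3))
```

The point of the structure is that `select_lambda` never receives `X_test` or `y_test`, so the reported test R² cannot influence the chosen penalty.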

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation chain is self-contained: the Eckart-Young guarantee for the ★_G-SVD, the Kronecker factorization, and the ★_G algebra itself are reported as machine-verified, with a 600-line Lean 4 formalization that checks the algebraic properties without reference to the QM9 results or any fitted parameters. The QM9 demonstration applies the already-verified decomposition to recover expected irrep dominance patterns from data alone and reports closed-form ridge-regression predictions; neither step redefines a quantity in terms of itself nor renames a fit as a prediction. No load-bearing claim reduces to a self-citation chain or to an ansatz smuggled in via prior work by the same authors.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based on abstract only; the framework rests on the definition of the ★_G multiplication for finite groups and standard facts from representation theory. No explicit free parameters are named, though ridge regression is used in the QM9 experiments.

axioms (1)
  • domain assumption: Any finite group G defines a tensor multiplication rule ★_G that makes equivariance an intrinsic algebraic property.
    This is the foundational definition stated in the opening of the abstract.
invented entities (1)
  • ★_G tensor algebra (no independent evidence)
    purpose: To encode symmetry as an algebraic multiplication rule rather than an architectural constraint.
    New structure introduced by the paper; no independent evidence outside the framework itself is provided in the abstract.

pith-pipeline@v0.9.0 · 5942 in / 1388 out tokens · 69852 ms · 2026-05-21T07:31:34.059938+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
