pith. sign in

arxiv: 2606.13408 · v1 · pith:AD3NTGA3new · submitted 2026-06-11 · 💻 cs.SC

A catalog of fast matrix multiplication algorithms with frontier-closure search

Pith reviewed 2026-06-27 05:11 UTC · model grok-4.3

classification 💻 cs.SC
keywords fast matrix multiplicationbilinear complexitycatalogfrontier-closure searchKronecker productcommutative algorithmsalgorithm recombination
0
0 comments X

The pith

A machine-checkable catalog collects fast matrix multiplication schemes up to 32x32x32 over five fields.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles a single machine-verifiable collection of low-rank matrix multiplication schemes for all shapes through 32 cubed. It covers the rational, integer, real, complex, and binary fields, plus a commutative variant. The collection is grown by repeatedly applying a fixed set of recombination rules to existing entries. This separates the work of inventing new bilinear cores from the work of composing known cores. The result refreshes standard comparison tables for each field.

Core claim

We present a unified, machine-checkable catalog covering shapes up to 32x32x32 over Rationals, Integers, Reals, Complex, and F2, with a separate axis for commutative algorithms. Derivation over this catalog is performed by a frontier-closure search that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection under the non-overlap property that our recombination does not rediscover the shared bilinear products of hand-crafted constructions.

What carries the argument

The frontier-closure search, which grows the catalog by applying a closed set of recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) while enforcing the non-overlap property.

If this is right

  • The catalog provides refreshed DIS09 comparison tables split per field and with a commutative column.
  • The non-overlap property draws a clean line between inventing new bilinear cores and composing known ones.
  • Attribution puzzles in the literature are resolved by separating core invention from composition.
  • Tooling allows automatic regeneration of the catalog and tables as new base schemes are added.
  • Systematic recombination discovers many new low-rank schemes across the covered fields.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future searches could be directed to target only novel bilinear cores rather than recombinations.
  • The method supplies a practical way to check whether known constructions up to size 32 are complete.
  • Similar closure-based catalogs could be constructed for other algebraic operations that admit recombination rules.

Load-bearing premise

The listed recombination operations cannot rediscover the shared bilinear products of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor.

What would settle it

An explicit sequence of recombination steps that produces the same bilinear products as a hand-crafted algorithm like Strassen's would falsify the non-overlap property.

Figures

Figures reproduced from arXiv: 2606.13408 by Benoit Chatain Lacelle.

Figure 1
Figure 1. Figure 1: The data of Table 3 as a curve: median [PITH_FULL_IMAGE:figures/full_fig_p021_1.png] view at source ↗
read the original abstract

The 2022--2026 burst of activity in small-format matrix multiplication (AlphaTensor 2022, AlphaEvolve 2025, Schwartz--Zwecher 2025) has produced striking individual results but scattered them across different fields, attribution conventions, and serialisation formats. A complementary line of work -- Perminov's open-source flip-graph framework~\cite{perminov2026fast,perminov2025fast} -- instead drives existing construction methods, notably flip-graph and \emph{meta-flip-graph} search, at scale across large format spaces, discovering many new low-rank schemes (including ternary-integer ones) that further enrich the landscape this catalog must unify. We present a unified, machine-checkable catalog covering shapes up to \nmpshape{32}{32}{32} over \Rationals, \Integers, \Reals, \Complex, and \Ftwo, with a separate axis for commutative algorithms (Waksman 1970, Makarov 1986, Rosowski 2019). Derivation over this catalog is performed by a \emph{frontier-closure search} that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation (with optional output peeling and pair fusion), and downward projection. A central methodological point is the \emph{non-overlap property}: our recombination does not, and cannot, rediscover the shared bilinear products that hand-crafted constructions (Strassen, Laderman, Smirnov, AlphaTensor) are built around. This draws a clean line between the ``find a cleverer bilinear core'' and ``compose known cores'' axes of progress, and resolves several attribution puzzles in the literature. We refresh the DIS09 comparison tables, split per field and with a commutative column, and provide the tooling to regenerate them automatically as the catalog evolves.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents a unified, machine-checkable catalog of fast matrix multiplication algorithms for shapes up to 32×32×32 over Rationals, Integers, Reals, Complex, and F2 (with a separate commutative axis), constructed via frontier-closure search that recombines prior entries using axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection. A central claim is the non-overlap property asserting that these operations do not rediscover the bilinear cores of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor, thereby separating composition-based progress from core-discovery progress; the work also refreshes DIS09 tables with tooling for automatic regeneration.

Significance. If the non-overlap property can be established and the catalog entries machine-verified, the work would be significant for the field by unifying scattered results across recent literature (AlphaTensor, Perminov flip-graph searches, etc.), supplying reproducible tooling to regenerate comparison tables, and clarifying attribution puzzles. The explicit provision of machine-checkable catalog entries and regeneration tooling is a clear strength.

major comments (1)
  1. [Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.
minor comments (1)
  1. [Abstract] The abstract states that the catalog covers shapes up to 32×32×32 but does not indicate the section or appendix containing the actual catalog entries or the regenerated DIS09 tables.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for recognizing the potential significance of the catalog and tooling. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.

    Authors: We agree that the manuscript currently asserts the non-overlap property without an accompanying formal argument, invariant, or verification procedure. The operations are defined to act exclusively at the level of composition on existing bilinear algorithms; the search frontier is seeded only with base cases that deliberately exclude the cores of the cited hand-crafted constructions. To make the claim rigorous, the revised manuscript will add a dedicated subsection that states the preserved invariant for each operation and will include (or describe) machine-checkable verification code that, for every catalog entry up to 32×32×32 over the listed rings, confirms that the set of used bilinear products does not match those of Strassen, Laderman, Smirnov, or AlphaTensor. This directly addresses the load-bearing status of the property. revision: yes

Circularity Check

0 steps flagged

No circularity; catalog built by explicit recombination rules on external entries

full rationale

The derivation chain consists of a frontier-closure search that applies a fixed set of recombination operations (axis-flip, Kronecker product, concatenation, serendipitous products, recombination-with-allocation, downward projection) to catalog entries. The non-overlap property is stated as an asserted methodological fact about what those operations do, not as a quantity fitted inside the paper or obtained via self-citation load-bearing. No equation or claim reduces the catalog output to a parameter defined by the same catalog, and the cited prior work (Perminov et al.) is by different authors. The result is therefore self-contained against the machine-checkable recombination process described.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The construction rests on the domain assumption that the listed recombination operations preserve the non-overlap property with existing bilinear cores; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption The non-overlap property holds for the recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection).
    Invoked as the central methodological point that separates composition from core invention.

pith-pipeline@v0.9.1-grok · 5881 in / 1374 out tokens · 32537 ms · 2026-06-27T05:11:13.829180+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 7 canonical work pages · 3 internal anchors

  1. [1]

    On symmetries of the Strassen algorithm

    Vladimir P. Burichenko. On symmetries of the Strassen algorithm.arXiv:1408.6273, 2014

  2. [2]

    de Groote

    Hans F. de Groote. On varieties of optimal algorithms for the computation of bilinear mappings. ii. optimal algorithms for2 × 2-matrix multiplication.Theoretical Computer Science, 7(2):127–148, 1978

  3. [3]

    Nazrul Islam, and Éric Schost

    Charles-Éric Drevet, Md. Nazrul Islam, and Éric Schost. Optimization techniques for small matrix multiplication.Theoretical Computer Science, 412(22):2219–2236, 2011. Preprint dated 2009; published 2011

  4. [4]

    A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025

    Jean-Guillaume Dumas, Clément Pernet, and Alexandre Sedoglavic. A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025. arXiv:2506.13242

  5. [5]

    Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

    Alhussein Fawzi et al. Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

  6. [6]

    Hopcroft and Leslie R

    John E. Hopcroft and Leslie R. Kerr. On minimizing the number of multiplications necessary for matrix multiplication.SIAM Journal on Applied Mathematics, 20(1):30–36, 1971

  7. [7]

    Anwarul Islam

    Md. Anwarul Islam. Bilinear algorithms for matrix multiplication and their applications. PhD thesis, University of Western Ontario, 2009

  8. [8]

    Exploiting the Structure in Tensor Decompositions for Matrix Multiplication

    Manuel Kauers, Jakob Moosbauer, and Isaac Wood. Exploiting the structure in tensor decompositions for matrix multiplication.arXiv preprint arXiv:2602.11041, 2026

  9. [9]

    Laderman

    Julian D. Laderman. A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications.Bulletin of the AMS, 82(1):126–128, 1976

  10. [10]

    O. M. Makarov. An algorithm for multiplication of3× 3matrices.Zh. Vychisl. Mat. i Mat. Fiz., 26(2):293–294, 320, 1986

  11. [11]

    Alphaevolve: a coding agent for scientific and algorithmic discovery,

    Alexander Novikov et al. Alphaevolve: a coding agent for scientific and algorithmic discovery,

  12. [12]
  13. [13]

    V. Ya. Pan. Strassen’s algorithm is not optimal – trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. InProc. 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 166–176, 1978

  14. [14]

    V. Ya. Pan. New fast algorithms for matrix operations.SIAM Journal on Computing, 9(2):321–342, 1980. 40

  15. [15]

    Perminov

    Andrew I. Perminov. Fastmatrixmultiplication. https://github.com/dronperminov/ FastMatrixMultiplication. Accessed 2026-06

  16. [16]

    Perminov

    Andrew I. Perminov. Fast matrix multiplication via ternary meta flip graphs.arXiv preprint arXiv:2511.20317, 2025

  17. [17]

    Perminov

    Andrew I. Perminov. Fast matrix multiplication in small formats: Discovering new schemes with an open-source flip graph framework.arXiv preprint arXiv:2603.02398, 2026

  18. [18]

    Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

    Andreas Rosowski. Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

  19. [19]

    Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

    Arnold Schönhage. Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

  20. [20]

    Disjoint-sum constructions for fast matrix multiplication

    Oded Schwartz and Eyal Zwecher. Disjoint-sum constructions for fast matrix multiplication. hal-05121550, preprint, 2025

  21. [21]

    Fmm: Fast matrix multiplication catalog

    Alexandre Sedoglavic et al. Fmm: Fast matrix multiplication catalog. https://fmm. univ-lille.fr/. Accessed 2026-06

  22. [22]

    PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

    Alexey Vladimirovich Smirnov.Bilinear Complexity and Practical Algorithms for Matrix Multiplication. PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

  23. [23]

    Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

    Alexey Vladimirovich Smirnov. Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

  24. [24]

    Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

    Volker Strassen. Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

  25. [25]

    On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970

    Abraham Waksman. On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970. 41