A catalog of fast matrix multiplication algorithms with frontier-closure search

Benoit Chatain Lacelle

arxiv: 2606.13408 · v1 · pith:AD3NTGA3new · submitted 2026-06-11 · 💻 cs.SC

A catalog of fast matrix multiplication algorithms with frontier-closure search

Benoit Chatain Lacelle This is my paper

Pith reviewed 2026-06-27 05:11 UTC · model grok-4.3

classification 💻 cs.SC

keywords fast matrix multiplicationbilinear complexitycatalogfrontier-closure searchKronecker productcommutative algorithmsalgorithm recombination

0 comments

The pith

A machine-checkable catalog collects fast matrix multiplication schemes up to 32x32x32 over five fields.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles a single machine-verifiable collection of low-rank matrix multiplication schemes for all shapes through 32 cubed. It covers the rational, integer, real, complex, and binary fields, plus a commutative variant. The collection is grown by repeatedly applying a fixed set of recombination rules to existing entries. This separates the work of inventing new bilinear cores from the work of composing known cores. The result refreshes standard comparison tables for each field.

Core claim

We present a unified, machine-checkable catalog covering shapes up to 32x32x32 over Rationals, Integers, Reals, Complex, and F2, with a separate axis for commutative algorithms. Derivation over this catalog is performed by a frontier-closure search that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection under the non-overlap property that our recombination does not rediscover the shared bilinear products of hand-crafted constructions.

What carries the argument

The frontier-closure search, which grows the catalog by applying a closed set of recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) while enforcing the non-overlap property.

If this is right

The catalog provides refreshed DIS09 comparison tables split per field and with a commutative column.
The non-overlap property draws a clean line between inventing new bilinear cores and composing known ones.
Attribution puzzles in the literature are resolved by separating core invention from composition.
Tooling allows automatic regeneration of the catalog and tables as new base schemes are added.
Systematic recombination discovers many new low-rank schemes across the covered fields.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future searches could be directed to target only novel bilinear cores rather than recombinations.
The method supplies a practical way to check whether known constructions up to size 32 are complete.
Similar closure-based catalogs could be constructed for other algebraic operations that admit recombination rules.

Load-bearing premise

The listed recombination operations cannot rediscover the shared bilinear products of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor.

What would settle it

An explicit sequence of recombination steps that produces the same bilinear products as a hand-crafted algorithm like Strassen's would falsify the non-overlap property.

Figures

Figures reproduced from arXiv: 2606.13408 by Benoit Chatain Lacelle.

read the original abstract

The 2022--2026 burst of activity in small-format matrix multiplication (AlphaTensor 2022, AlphaEvolve 2025, Schwartz--Zwecher 2025) has produced striking individual results but scattered them across different fields, attribution conventions, and serialisation formats. A complementary line of work -- Perminov's open-source flip-graph framework~\cite{perminov2026fast,perminov2025fast} -- instead drives existing construction methods, notably flip-graph and \emph{meta-flip-graph} search, at scale across large format spaces, discovering many new low-rank schemes (including ternary-integer ones) that further enrich the landscape this catalog must unify. We present a unified, machine-checkable catalog covering shapes up to \nmpshape{32}{32}{32} over \Rationals, \Integers, \Reals, \Complex, and \Ftwo, with a separate axis for commutative algorithms (Waksman 1970, Makarov 1986, Rosowski 2019). Derivation over this catalog is performed by a \emph{frontier-closure search} that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation (with optional output peeling and pair fusion), and downward projection. A central methodological point is the \emph{non-overlap property}: our recombination does not, and cannot, rediscover the shared bilinear products that hand-crafted constructions (Strassen, Laderman, Smirnov, AlphaTensor) are built around. This draws a clean line between the ``find a cleverer bilinear core'' and ``compose known cores'' axes of progress, and resolves several attribution puzzles in the literature. We refresh the DIS09 comparison tables, split per field and with a commutative column, and provide the tooling to regenerate them automatically as the catalog evolves.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a systematic recombination framework and catalog for small matrix mult algorithms that separates composition from core inventions, but the non-overlap claim is asserted without shown verification.

read the letter

The useful piece is the frontier-closure search that recombines known entries via axis flips, Kronecker products, concatenation, serendipitous products, and downward projection while claiming a non-overlap property that keeps those steps from rediscovering the bilinear cores of Strassen, Laderman, Smirnov, or AlphaTensor results. This is paired with a machine-checkable catalog up to 32x32x32 over rationals, integers, reals, complexes, F2, and a commutative variant, plus refreshed DIS09-style tables that can be regenerated automatically.

What the work does cleanly is pull together scattered results from AlphaTensor, Perminov's flip-graph searches, and earlier tables into one place with explicit axes and attribution rules. The methodological separation between finding new cores and composing existing ones is a reasonable way to reduce attribution confusion in the literature.

The soft spot is that the abstract states the non-overlap property as a methodological guarantee but supplies no argument, invariant, or example showing that the listed operations stay disjoint from the hand-crafted bilinear products for the claimed shapes and rings. No catalog entries or regenerated table snippets appear either, so the machine-checkable claim cannot be inspected from what is given. The full text would need to close that gap for the separation to count as demonstrated rather than modeled.

This is aimed at people already working inside algebraic complexity and practical kernel design who track small-format matrix multiplication. A reader who needs a reference list or wants to extend the composition rules would find it relevant if the verification holds. It deserves peer review because the fragmentation problem it targets is real and the proposed organizing tool is concrete enough to test.

Referee Report

1 major / 1 minor

Summary. The manuscript presents a unified, machine-checkable catalog of fast matrix multiplication algorithms for shapes up to 32×32×32 over Rationals, Integers, Reals, Complex, and F2 (with a separate commutative axis), constructed via frontier-closure search that recombines prior entries using axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection. A central claim is the non-overlap property asserting that these operations do not rediscover the bilinear cores of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor, thereby separating composition-based progress from core-discovery progress; the work also refreshes DIS09 tables with tooling for automatic regeneration.

Significance. If the non-overlap property can be established and the catalog entries machine-verified, the work would be significant for the field by unifying scattered results across recent literature (AlphaTensor, Perminov flip-graph searches, etc.), supplying reproducible tooling to regenerate comparison tables, and clarifying attribution puzzles. The explicit provision of machine-checkable catalog entries and regeneration tooling is a clear strength.

major comments (1)

[Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.

minor comments (1)

[Abstract] The abstract states that the catalog covers shapes up to 32×32×32 but does not indicate the section or appendix containing the actual catalog entries or the regenerated DIS09 tables.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for recognizing the potential significance of the catalog and tooling. We address the single major comment below.

read point-by-point responses

Referee: [Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.

Authors: We agree that the manuscript currently asserts the non-overlap property without an accompanying formal argument, invariant, or verification procedure. The operations are defined to act exclusively at the level of composition on existing bilinear algorithms; the search frontier is seeded only with base cases that deliberately exclude the cores of the cited hand-crafted constructions. To make the claim rigorous, the revised manuscript will add a dedicated subsection that states the preserved invariant for each operation and will include (or describe) machine-checkable verification code that, for every catalog entry up to 32×32×32 over the listed rings, confirms that the set of used bilinear products does not match those of Strassen, Laderman, Smirnov, or AlphaTensor. This directly addresses the load-bearing status of the property. revision: yes

Circularity Check

0 steps flagged

No circularity; catalog built by explicit recombination rules on external entries

full rationale

The derivation chain consists of a frontier-closure search that applies a fixed set of recombination operations (axis-flip, Kronecker product, concatenation, serendipitous products, recombination-with-allocation, downward projection) to catalog entries. The non-overlap property is stated as an asserted methodological fact about what those operations do, not as a quantity fitted inside the paper or obtained via self-citation load-bearing. No equation or claim reduces the catalog output to a parameter defined by the same catalog, and the cited prior work (Perminov et al.) is by different authors. The result is therefore self-contained against the machine-checkable recombination process described.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The construction rests on the domain assumption that the listed recombination operations preserve the non-overlap property with existing bilinear cores; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption The non-overlap property holds for the recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection).
Invoked as the central methodological point that separates composition from core invention.

pith-pipeline@v0.9.1-grok · 5881 in / 1374 out tokens · 32537 ms · 2026-06-27T05:11:13.829180+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 7 canonical work pages · 3 internal anchors

[1]

On symmetries of the Strassen algorithm

Vladimir P. Burichenko. On symmetries of the Strassen algorithm.arXiv:1408.6273, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[2]

de Groote

Hans F. de Groote. On varieties of optimal algorithms for the computation of bilinear mappings. ii. optimal algorithms for2 × 2-matrix multiplication.Theoretical Computer Science, 7(2):127–148, 1978

1978
[3]

Nazrul Islam, and Éric Schost

Charles-Éric Drevet, Md. Nazrul Islam, and Éric Schost. Optimization techniques for small matrix multiplication.Theoretical Computer Science, 412(22):2219–2236, 2011. Preprint dated 2009; published 2011

2011
[4]

A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025

Jean-Guillaume Dumas, Clément Pernet, and Alexandre Sedoglavic. A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025. arXiv:2506.13242

work page arXiv 2025
[5]

Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

Alhussein Fawzi et al. Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

2022
[6]

Hopcroft and Leslie R

John E. Hopcroft and Leslie R. Kerr. On minimizing the number of multiplications necessary for matrix multiplication.SIAM Journal on Applied Mathematics, 20(1):30–36, 1971

1971
[7]

Anwarul Islam

Md. Anwarul Islam. Bilinear algorithms for matrix multiplication and their applications. PhD thesis, University of Western Ontario, 2009

2009
[8]

Exploiting the Structure in Tensor Decompositions for Matrix Multiplication

Manuel Kauers, Jakob Moosbauer, and Isaac Wood. Exploiting the structure in tensor decompositions for matrix multiplication.arXiv preprint arXiv:2602.11041, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[9]

Laderman

Julian D. Laderman. A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications.Bulletin of the AMS, 82(1):126–128, 1976

1976
[10]

O. M. Makarov. An algorithm for multiplication of3× 3matrices.Zh. Vychisl. Mat. i Mat. Fiz., 26(2):293–294, 320, 1986

1986
[11]

Alphaevolve: a coding agent for scientific and algorithmic discovery,

Alexander Novikov et al. Alphaevolve: a coding agent for scientific and algorithmic discovery,
[12]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Google DeepMind. arXiv:2506.13131

work page internal anchor Pith review Pith/arXiv arXiv
[13]

V. Ya. Pan. Strassen’s algorithm is not optimal – trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. InProc. 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 166–176, 1978

1978
[14]

V. Ya. Pan. New fast algorithms for matrix operations.SIAM Journal on Computing, 9(2):321–342, 1980. 40

1980
[15]

Perminov

Andrew I. Perminov. Fastmatrixmultiplication. https://github.com/dronperminov/ FastMatrixMultiplication. Accessed 2026-06

2026
[16]

Perminov

Andrew I. Perminov. Fast matrix multiplication via ternary meta flip graphs.arXiv preprint arXiv:2511.20317, 2025

work page arXiv 2025
[17]

Perminov

Andrew I. Perminov. Fast matrix multiplication in small formats: Discovering new schemes with an open-source flip graph framework.arXiv preprint arXiv:2603.02398, 2026

work page arXiv 2026
[18]

Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

Andreas Rosowski. Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

work page arXiv 1904
[19]

Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

Arnold Schönhage. Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

1981
[20]

Disjoint-sum constructions for fast matrix multiplication

Oded Schwartz and Eyal Zwecher. Disjoint-sum constructions for fast matrix multiplication. hal-05121550, preprint, 2025

2025
[21]

Fmm: Fast matrix multiplication catalog

Alexandre Sedoglavic et al. Fmm: Fast matrix multiplication catalog. https://fmm. univ-lille.fr/. Accessed 2026-06

2026
[22]

PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

Alexey Vladimirovich Smirnov.Bilinear Complexity and Practical Algorithms for Matrix Multiplication. PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

1986
[23]

Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

Alexey Vladimirovich Smirnov. Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

2013
[24]

Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

Volker Strassen. Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

1969
[25]

On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970

Abraham Waksman. On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970. 41

1970

[1] [1]

On symmetries of the Strassen algorithm

Vladimir P. Burichenko. On symmetries of the Strassen algorithm.arXiv:1408.6273, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[2] [2]

de Groote

Hans F. de Groote. On varieties of optimal algorithms for the computation of bilinear mappings. ii. optimal algorithms for2 × 2-matrix multiplication.Theoretical Computer Science, 7(2):127–148, 1978

1978

[3] [3]

Nazrul Islam, and Éric Schost

Charles-Éric Drevet, Md. Nazrul Islam, and Éric Schost. Optimization techniques for small matrix multiplication.Theoretical Computer Science, 412(22):2219–2236, 2011. Preprint dated 2009; published 2011

2011

[4] [4]

A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025

Jean-Guillaume Dumas, Clément Pernet, and Alexandre Sedoglavic. A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025. arXiv:2506.13242

work page arXiv 2025

[5] [5]

Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

Alhussein Fawzi et al. Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022

2022

[6] [6]

Hopcroft and Leslie R

John E. Hopcroft and Leslie R. Kerr. On minimizing the number of multiplications necessary for matrix multiplication.SIAM Journal on Applied Mathematics, 20(1):30–36, 1971

1971

[7] [7]

Anwarul Islam

Md. Anwarul Islam. Bilinear algorithms for matrix multiplication and their applications. PhD thesis, University of Western Ontario, 2009

2009

[8] [8]

Exploiting the Structure in Tensor Decompositions for Matrix Multiplication

Manuel Kauers, Jakob Moosbauer, and Isaac Wood. Exploiting the structure in tensor decompositions for matrix multiplication.arXiv preprint arXiv:2602.11041, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[9] [9]

Laderman

Julian D. Laderman. A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications.Bulletin of the AMS, 82(1):126–128, 1976

1976

[10] [10]

O. M. Makarov. An algorithm for multiplication of3× 3matrices.Zh. Vychisl. Mat. i Mat. Fiz., 26(2):293–294, 320, 1986

1986

[11] [11]

Alphaevolve: a coding agent for scientific and algorithmic discovery,

Alexander Novikov et al. Alphaevolve: a coding agent for scientific and algorithmic discovery,

[12] [12]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Google DeepMind. arXiv:2506.13131

work page internal anchor Pith review Pith/arXiv arXiv

[13] [13]

V. Ya. Pan. Strassen’s algorithm is not optimal – trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. InProc. 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 166–176, 1978

1978

[14] [14]

V. Ya. Pan. New fast algorithms for matrix operations.SIAM Journal on Computing, 9(2):321–342, 1980. 40

1980

[15] [15]

Perminov

Andrew I. Perminov. Fastmatrixmultiplication. https://github.com/dronperminov/ FastMatrixMultiplication. Accessed 2026-06

2026

[16] [16]

Perminov

Andrew I. Perminov. Fast matrix multiplication via ternary meta flip graphs.arXiv preprint arXiv:2511.20317, 2025

work page arXiv 2025

[17] [17]

Perminov

Andrew I. Perminov. Fast matrix multiplication in small formats: Discovering new schemes with an open-source flip graph framework.arXiv preprint arXiv:2603.02398, 2026

work page arXiv 2026

[18] [18]

Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

Andreas Rosowski. Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019

work page arXiv 1904

[19] [19]

Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

Arnold Schönhage. Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981

1981

[20] [20]

Disjoint-sum constructions for fast matrix multiplication

Oded Schwartz and Eyal Zwecher. Disjoint-sum constructions for fast matrix multiplication. hal-05121550, preprint, 2025

2025

[21] [21]

Fmm: Fast matrix multiplication catalog

Alexandre Sedoglavic et al. Fmm: Fast matrix multiplication catalog. https://fmm. univ-lille.fr/. Accessed 2026-06

2026

[22] [22]

PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

Alexey Vladimirovich Smirnov.Bilinear Complexity and Practical Algorithms for Matrix Multiplication. PhD thesis, Computing Center of the Russian Academy of Sciences, 1986

1986

[23] [23]

Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

Alexey Vladimirovich Smirnov. Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013

2013

[24] [24]

Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

Volker Strassen. Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969

1969

[25] [25]

On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970

Abraham Waksman. On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970. 41

1970