A catalog of fast matrix multiplication algorithms with frontier-closure search
Pith reviewed 2026-06-27 05:11 UTC · model grok-4.3
The pith
A machine-checkable catalog collects fast matrix multiplication schemes up to 32x32x32 over five fields.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a unified, machine-checkable catalog covering shapes up to 32x32x32 over Rationals, Integers, Reals, Complex, and F2, with a separate axis for commutative algorithms. Derivation over this catalog is performed by a frontier-closure search that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection under the non-overlap property that our recombination does not rediscover the shared bilinear products of hand-crafted constructions.
What carries the argument
The frontier-closure search, which grows the catalog by applying a closed set of recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) while enforcing the non-overlap property.
If this is right
- The catalog provides refreshed DIS09 comparison tables split per field and with a commutative column.
- The non-overlap property draws a clean line between inventing new bilinear cores and composing known ones.
- Attribution puzzles in the literature are resolved by separating core invention from composition.
- Tooling allows automatic regeneration of the catalog and tables as new base schemes are added.
- Systematic recombination discovers many new low-rank schemes across the covered fields.
Where Pith is reading between the lines
- Future searches could be directed to target only novel bilinear cores rather than recombinations.
- The method supplies a practical way to check whether known constructions up to size 32 are complete.
- Similar closure-based catalogs could be constructed for other algebraic operations that admit recombination rules.
Load-bearing premise
The listed recombination operations cannot rediscover the shared bilinear products of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor.
What would settle it
An explicit sequence of recombination steps that produces the same bilinear products as a hand-crafted algorithm like Strassen's would falsify the non-overlap property.
Figures
read the original abstract
The 2022--2026 burst of activity in small-format matrix multiplication (AlphaTensor 2022, AlphaEvolve 2025, Schwartz--Zwecher 2025) has produced striking individual results but scattered them across different fields, attribution conventions, and serialisation formats. A complementary line of work -- Perminov's open-source flip-graph framework~\cite{perminov2026fast,perminov2025fast} -- instead drives existing construction methods, notably flip-graph and \emph{meta-flip-graph} search, at scale across large format spaces, discovering many new low-rank schemes (including ternary-integer ones) that further enrich the landscape this catalog must unify. We present a unified, machine-checkable catalog covering shapes up to \nmpshape{32}{32}{32} over \Rationals, \Integers, \Reals, \Complex, and \Ftwo, with a separate axis for commutative algorithms (Waksman 1970, Makarov 1986, Rosowski 2019). Derivation over this catalog is performed by a \emph{frontier-closure search} that recombines catalog entries by axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation (with optional output peeling and pair fusion), and downward projection. A central methodological point is the \emph{non-overlap property}: our recombination does not, and cannot, rediscover the shared bilinear products that hand-crafted constructions (Strassen, Laderman, Smirnov, AlphaTensor) are built around. This draws a clean line between the ``find a cleverer bilinear core'' and ``compose known cores'' axes of progress, and resolves several attribution puzzles in the literature. We refresh the DIS09 comparison tables, split per field and with a commutative column, and provide the tooling to regenerate them automatically as the catalog evolves.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a unified, machine-checkable catalog of fast matrix multiplication algorithms for shapes up to 32×32×32 over Rationals, Integers, Reals, Complex, and F2 (with a separate commutative axis), constructed via frontier-closure search that recombines prior entries using axis-flip, Kronecker, axis concatenation, serendipitous products, recombination-with-allocation, and downward projection. A central claim is the non-overlap property asserting that these operations do not rediscover the bilinear cores of hand-crafted constructions such as Strassen, Laderman, Smirnov, and AlphaTensor, thereby separating composition-based progress from core-discovery progress; the work also refreshes DIS09 tables with tooling for automatic regeneration.
Significance. If the non-overlap property can be established and the catalog entries machine-verified, the work would be significant for the field by unifying scattered results across recent literature (AlphaTensor, Perminov flip-graph searches, etc.), supplying reproducible tooling to regenerate comparison tables, and clarifying attribution puzzles. The explicit provision of machine-checkable catalog entries and regeneration tooling is a clear strength.
major comments (1)
- [Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.
minor comments (1)
- [Abstract] The abstract states that the catalog covers shapes up to 32×32×32 but does not indicate the section or appendix containing the actual catalog entries or the regenerated DIS09 tables.
Simulated Author's Rebuttal
We thank the referee for the careful review and for recognizing the potential significance of the catalog and tooling. We address the single major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract (central methodological point paragraph): The non-overlap property is asserted as a methodological fact guaranteeing that the listed recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection) cannot rediscover the shared bilinear products of Strassen, Laderman, Smirnov, or AlphaTensor constructions. No formal argument, invariant, exhaustive enumeration, or machine-checkable verification is supplied showing that this holds for every shape up to 32×32×32 and every coefficient ring. This property is load-bearing for the claim of drawing a clean line between the two axes of progress.
Authors: We agree that the manuscript currently asserts the non-overlap property without an accompanying formal argument, invariant, or verification procedure. The operations are defined to act exclusively at the level of composition on existing bilinear algorithms; the search frontier is seeded only with base cases that deliberately exclude the cores of the cited hand-crafted constructions. To make the claim rigorous, the revised manuscript will add a dedicated subsection that states the preserved invariant for each operation and will include (or describe) machine-checkable verification code that, for every catalog entry up to 32×32×32 over the listed rings, confirms that the set of used bilinear products does not match those of Strassen, Laderman, Smirnov, or AlphaTensor. This directly addresses the load-bearing status of the property. revision: yes
Circularity Check
No circularity; catalog built by explicit recombination rules on external entries
full rationale
The derivation chain consists of a frontier-closure search that applies a fixed set of recombination operations (axis-flip, Kronecker product, concatenation, serendipitous products, recombination-with-allocation, downward projection) to catalog entries. The non-overlap property is stated as an asserted methodological fact about what those operations do, not as a quantity fitted inside the paper or obtained via self-citation load-bearing. No equation or claim reduces the catalog output to a parameter defined by the same catalog, and the cited prior work (Perminov et al.) is by different authors. The result is therefore self-contained against the machine-checkable recombination process described.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The non-overlap property holds for the recombination operations (axis-flip, Kronecker, concatenation, serendipitous products, recombination-with-allocation, downward projection).
Reference graph
Works this paper leans on
-
[1]
On symmetries of the Strassen algorithm
Vladimir P. Burichenko. On symmetries of the Strassen algorithm.arXiv:1408.6273, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[2]
de Groote
Hans F. de Groote. On varieties of optimal algorithms for the computation of bilinear mappings. ii. optimal algorithms for2 × 2-matrix multiplication.Theoretical Computer Science, 7(2):127–148, 1978
1978
-
[3]
Nazrul Islam, and Éric Schost
Charles-Éric Drevet, Md. Nazrul Islam, and Éric Schost. Optimization techniques for small matrix multiplication.Theoretical Computer Science, 412(22):2219–2236, 2011. Preprint dated 2009; published 2011
2011
-
[4]
A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025
Jean-Guillaume Dumas, Clément Pernet, and Alexandre Sedoglavic. A non-commutative algorithm for multiplying4 × 4matrices using 48 non-complex multiplications, 2025. arXiv:2506.13242
-
[5]
Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022
Alhussein Fawzi et al. Discovering faster matrix multiplication algorithms with reinforcement learning.Nature, 610:47–53, 2022
2022
-
[6]
Hopcroft and Leslie R
John E. Hopcroft and Leslie R. Kerr. On minimizing the number of multiplications necessary for matrix multiplication.SIAM Journal on Applied Mathematics, 20(1):30–36, 1971
1971
-
[7]
Anwarul Islam
Md. Anwarul Islam. Bilinear algorithms for matrix multiplication and their applications. PhD thesis, University of Western Ontario, 2009
2009
-
[8]
Exploiting the Structure in Tensor Decompositions for Matrix Multiplication
Manuel Kauers, Jakob Moosbauer, and Isaac Wood. Exploiting the structure in tensor decompositions for matrix multiplication.arXiv preprint arXiv:2602.11041, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[9]
Laderman
Julian D. Laderman. A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications.Bulletin of the AMS, 82(1):126–128, 1976
1976
-
[10]
O. M. Makarov. An algorithm for multiplication of3× 3matrices.Zh. Vychisl. Mat. i Mat. Fiz., 26(2):293–294, 320, 1986
1986
-
[11]
Alphaevolve: a coding agent for scientific and algorithmic discovery,
Alexander Novikov et al. Alphaevolve: a coding agent for scientific and algorithmic discovery,
-
[12]
AlphaEvolve: A coding agent for scientific and algorithmic discovery
Google DeepMind. arXiv:2506.13131
work page internal anchor Pith review Pith/arXiv arXiv
-
[13]
V. Ya. Pan. Strassen’s algorithm is not optimal – trilinear technique of aggregating, uniting and canceling for constructing fast algorithms for matrix operations. InProc. 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 166–176, 1978
1978
-
[14]
V. Ya. Pan. New fast algorithms for matrix operations.SIAM Journal on Computing, 9(2):321–342, 1980. 40
1980
-
[15]
Perminov
Andrew I. Perminov. Fastmatrixmultiplication. https://github.com/dronperminov/ FastMatrixMultiplication. Accessed 2026-06
2026
- [16]
- [17]
-
[18]
Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019
Andreas Rosowski. Fast commutative matrix algorithms.arXiv preprint arXiv:1904.07683, 2019
-
[19]
Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981
Arnold Schönhage. Partial and total matrix multiplication.SIAM Journal on Computing, 10(3):434–455, 1981
1981
-
[20]
Disjoint-sum constructions for fast matrix multiplication
Oded Schwartz and Eyal Zwecher. Disjoint-sum constructions for fast matrix multiplication. hal-05121550, preprint, 2025
2025
-
[21]
Fmm: Fast matrix multiplication catalog
Alexandre Sedoglavic et al. Fmm: Fast matrix multiplication catalog. https://fmm. univ-lille.fr/. Accessed 2026-06
2026
-
[22]
PhD thesis, Computing Center of the Russian Academy of Sciences, 1986
Alexey Vladimirovich Smirnov.Bilinear Complexity and Practical Algorithms for Matrix Multiplication. PhD thesis, Computing Center of the Russian Academy of Sciences, 1986
1986
-
[23]
Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013
Alexey Vladimirovich Smirnov. Bilinear complexity and practical algorithms for matrix multiplication.Computational Mathematics and Mathematical Physics, 53(12):1781–1795, 2013
2013
-
[24]
Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969
Volker Strassen. Gaussian elimination is not optimal.Numerische Mathematik, 13(4):354–356, 1969
1969
-
[25]
On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970
Abraham Waksman. On winograd’s algorithm for inner products.IEEE Transactions on Computers, C-19(4):360–361, 1970. 41
1970
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.