arxiv: 2604.06107 · v1 · submitted 2026-04-07 · 💻 cs.AI · math.HO· math.LO

Recognition: 3 theorem links

· Lean Theorem

Artificial Intelligence and the Structure of Mathematics

Maissam Barkeshli, Michael H. Freedman, Michael R. Douglas

Authors on Pith no claims yet

Pith reviewed 2026-05-10 19:08 UTC · model grok-4.3

classification 💻 cs.AI math.HOmath.LO

keywords artificial intelligencemathematical proofsPlatonic worldsautomated discoveryproof hypergraphsfoundational structuremathematics and AI

0 comments

The pith

AI offers a complementary path to mathematical logic by revealing the global structure of formal proofs through automated exploration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that AI can move beyond solving isolated problems to mapping the overall architecture of mathematics. It sketches mathematics as networks of universal proofs and structural hypergraphs, then lists criteria AI models must meet to explore these networks autonomously. If such models succeed in traversing what the authors call Platonic mathematical worlds, they are expected to expose both large-scale patterns and the narrower ribbons of structure that humans can grasp. This route would address the old question of whether mathematics is discovered or invented by letting machines report on the terrain they encounter.

Core claim

The authors argue that AI agents satisfying criteria for automated mathematical discovery can traverse Platonic mathematical worlds represented via universal proofs and structural hypergraphs, thereby illuminating the global structure of formal proofs in a manner complementary to mathematical logic and clarifying both the overall nature of mathematics and the portions accessible to human understanding.

What carries the argument

Universal proof and structural hypergraphs, which encode the formal connections across all mathematical statements and allow systematic traversal by AI agents.

Load-bearing premise

AI models can be built that meet the stated criteria for automated discovery and will meaningfully traverse Platonic mathematical worlds to produce new insights about mathematical structure.

What would settle it

An AI system trained on large corpora of formal proofs that fails to identify any previously unrecognized global patterns or cross-proof dependencies beyond those already captured by standard logical analysis.

Figures

Figures reproduced from arXiv: 2604.06107 by Maissam Barkeshli, Michael H. Freedman, Michael R. Douglas.

**Figure 1.** Figure 1: Criteria for AMD 2 The hypergraphs of mathematics Our goal in this section is to briefly sketch the overarching structure of formal mathematics in terms of a series of hypergraphs. We consider it to be an important research program to further flesh out and make rigorous the discussion below. We subsequently use the hypergraphs to pose fundamental questions about the structure of mathematics and consider ho… view at source ↗

**Figure 2.** Figure 2: Figure generated by ChatGPT 35This is time-dependent, making it interesting to train an LLM only on mathematics up to time t and compare its discoveries with those of humans. 34 [PITH_FULL_IMAGE:figures/full_fig_p034_2.png] view at source ↗

**Figure 3.** Figure 3: The Iota-Red Hyperedge. The blue solid lines represent static construction (Application). The red dashed line represents the computational step. Because the second argument matches the constructor S, the graph explicitly links the expression a + S(b) to its reduction S(a + b). A.3 Example: The Doubling Function Following §1.9 of the HoTT Book, we define the operation double(n) = 2n using the recursion prin… view at source ↗

read the original abstract

Recent progress in artificial intelligence (AI) is unlocking transformative capabilities for mathematics. There is great hope that AI will help solve major open problems and autonomously discover new mathematical concepts. In this essay, we further consider how AI may open a grand perspective on mathematics by forging a new route, complementary to mathematical\textbf{ logic,} to understanding the global structure of formal \textbf{proof}\textbf{s}. We begin by providing a sketch of the formal structure of mathematics in terms of universal proof and structural hypergraphs and discuss questions this raises about the foundational structure of mathematics. We then outline the main ingredients and provide a set of criteria to be satisfied for AI models capable of automated mathematical discovery. As we send AI agents to traverse Platonic mathematical worlds, we expect they will teach us about the nature of mathematics: both as a whole, and the small ribbons conducive to human understanding. Perhaps they will shed light on the old question: "Is mathematics discovered or invented?" Can we grok the terrain of these \textbf{Platonic worlds}?

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a speculative essay sketching AI traversal of proof hypergraphs to reveal global math structure, but it offers no definitions, examples, or evidence to make the idea concrete.

read the letter

This essay from Barkeshli, Douglas, and Freedman sketches a vision where AI agents explore mathematics represented as structural hypergraphs of proofs, potentially teaching us about its global properties in ways that logic cannot. They list criteria for AI models that could do automated discovery and imagine these systems traversing Platonic worlds to address questions like whether math is discovered or invented. The piece is short and stays at the level of ideas rather than claims with support.

Referee Report

2 major / 2 minor

Summary. The manuscript is an essay arguing that recent AI progress offers a new route, complementary to mathematical logic, for understanding the global structure of formal proofs. It sketches mathematics via 'universal proofs' and 'structural hypergraphs,' outlines high-level criteria for AI models of automated discovery, and posits that AI agents traversing Platonic mathematical worlds will reveal insights into the nature of mathematics, including whether it is discovered or invented.

Significance. If the sketched framework and future AI capabilities are realized, the work could broaden foundational studies by integrating AI-driven exploration of proof structures, potentially yielding global insights inaccessible to logic alone and addressing philosophical questions about mathematics. The essay correctly identifies the prospective value of AI beyond problem-solving, but its significance is prospective and depends on operationalizing the proposed concepts.

major comments (2)

In the section providing a sketch of the formal structure of mathematics, the concepts of 'universal proof' and 'structural hypergraphs' are introduced without definitions of their nodes, edges, global properties, or any concrete example. This absence is load-bearing for the central claim that AI traversal will reveal new insights into proof structure, as there is no basis to distinguish such traversal from existing automated theorem proving.
In the section outlining the main ingredients and criteria for AI models capable of automated mathematical discovery, the criteria are presented only as high-level desiderata without formalization, mathematical specification, or a minimal worked example (e.g., for a single axiom system). This undermines the assertion that such models will teach us about the nature of mathematics in ways complementary to logic.

minor comments (2)

Abstract contains visible LaTeX commands such as boldface markers that should be properly rendered in the final version.
The manuscript would benefit from citations to related work in automated theorem proving, proof mining, and the philosophy of mathematical discovery to better situate the proposals.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which correctly identify the need for greater concreteness in our conceptual sketch. We address each point below and will revise the manuscript to incorporate definitions and examples while preserving the essay's exploratory character.

read point-by-point responses

Referee: In the section providing a sketch of the formal structure of mathematics, the concepts of 'universal proof' and 'structural hypergraphs' are introduced without definitions of their nodes, edges, global properties, or any concrete example. This absence is load-bearing for the central claim that AI traversal will reveal new insights into proof structure, as there is no basis to distinguish such traversal from existing automated theorem proving.

Authors: We agree that the manuscript introduces 'universal proof' and 'structural hypergraphs' at a conceptual level without formal definitions, node/edge specifications, global properties, or examples. This limits the ability to clearly differentiate the proposed AI-driven traversal from existing automated theorem proving. As the work is an essay outlining prospective ideas rather than a technical formalism, the presentation remained high-level. In revision we will add explicit definitions (nodes as statements or subproofs, edges as inference steps or dependency relations), describe global properties such as hypergraph connectivity and centrality, and include a concrete example based on a small system such as propositional logic to illustrate how AI traversal could yield structural insights beyond standard ATP. revision: yes
Referee: In the section outlining the main ingredients and criteria for AI models capable of automated mathematical discovery, the criteria are presented only as high-level desiderata without formalization, mathematical specification, or a minimal worked example (e.g., for a single axiom system). This undermines the assertion that such models will teach us about the nature of mathematics in ways complementary to logic.

Authors: The referee accurately observes that the criteria for AI models of automated discovery are stated as high-level desiderata without formalization or a minimal worked example. This choice aligns with the essay's goal of sketching broad requirements for future systems rather than providing an implemented specification. To better support the claim of complementarity to logic, we will revise the section to include a minimal worked example, such as applying the criteria to a simple axiom system like basic group theory or a fragment of Peano arithmetic, showing how the AI exploration could reveal structural or accessibility features not directly accessible through logical analysis alone. revision: yes

Circularity Check

0 steps flagged

No circularity: high-level essay with no derivations, equations, or self-referential reductions

full rationale

The paper is a conceptual essay sketching mathematics via universal proofs and structural hypergraphs, then outlining high-level criteria for AI discovery models. It contains no equations, no fitted parameters, no derivations, and no load-bearing self-citations. Claims about AI agents traversing Platonic worlds are presented as expectations and open questions rather than results derived from prior inputs or definitions. The criteria remain desiderata without formalization or reduction to the paper's own assumptions. This matches the reader's zero-circularity assessment; the argument is self-contained as philosophical speculation without any step that reduces by construction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The essay relies on assumptions about future AI capabilities and introduces conceptual structures without independent evidence or prior derivation.

axioms (1)

domain assumption AI models can be constructed that meet criteria for autonomous mathematical discovery and traversal of proof structures.
Invoked when outlining main ingredients and criteria for AI models capable of automated discovery.

invented entities (2)

universal proof no independent evidence
purpose: To represent the formal structure of mathematics as a whole.
Introduced in the sketch of the formal structure of mathematics.
structural hypergraphs no independent evidence
purpose: To model the global structure of formal proofs.
Proposed as a way to discuss questions about the foundational structure of mathematics.

pith-pipeline@v0.9.0 · 5481 in / 1167 out tokens · 32097 ms · 2026-05-10T19:08:30.632170+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear
We begin by providing a sketch of the formal structure of mathematics in terms of universal proof and structural hypergraphs... AI agents to traverse Platonic mathematical worlds
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
efficiency E(P) := m(P)/l(P)... willow tree of mathematics... hubs and bottlenecks
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear
coarse equivalence... many Platonic worlds

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

NeuroClaw Technical Report
cs.CV 2026-04 unverdicted novelty 6.0

NeuroClaw introduces a three-tier multi-agent framework and NeuroBench benchmark that improve executability and reproducibility scores for neuroimaging tasks when used with multimodal LLMs.
Ablation and the Meno: Tools for Empirical Metamathematics
cs.LO 2026-04 unverdicted novelty 6.0

Meno and tactic ablation on Tao's Analysis I generate proof populations that embed on low one- or two-dimensional submanifolds far from human constructions in Goedel Prover space.
Astrolabe: A Content-Addressable Hypergraph for Semantic Knowledge Management
math.HO 2026-04 unverdicted novelty 5.0

Astrolabe is a content-addressable hypergraph for semantic knowledge management using hash-identified entries, arbitrary-width ordered references, and plugin-extensible records.

Reference graph

Works this paper leans on

88 extracted references · 44 canonical work pages · cited by 3 Pith papers · 2 internal anchors

[1]

Why philosophers should care about computational complexity.Com- plexity, 101:5

Scott Aaronson. Why philosophers should care about computational complexity.Com- plexity, 101:5
[2]

Abouzaid, A

Mohammed Abouzaid, Andrew J Blumberg, Martin Hairer, Joe Kileel, Tamara G Kolda, Paul D Nelson, Daniel Spielman, Nikhil Srivastava, Rachel Ward, Shmuel Wein- berger, et al. First proof.arXiv preprint arXiv:2602.05192, 2026

work page arXiv 2026
[3]

The Univalent Foundations Pro- gram Institute for Advanced Study, 2013

Peter Aczel, Benedikt Ahrens, Thorsten Altenkirch, Steve Awodey, Bruno Barras, An- drej Bauer, Yves Bertot, Marc Bezem, Thierry Coquand, Eric Finster, et al.Homotopy type theory: univalent foundations of mathematics. The Univalent Foundations Pro- gram Institute for Advanced Study, 2013

2013
[4]

Freedman, and Michael Mulligan

Vitaly Aksenov, Eve Bodnia, Michael H. Freedman, and Michael Mulligan. Compres- sion is all you need: Modeling mathematics. 2026. URLhttps://arxiv.org/abs/26 03.20396

2026
[5]

Mathematicians in the age of ai

Jeremy Avigad. Mathematicians in the age of ai. PDF, March 2026. URLhttps: //www.andrew.cmu.edu/user/avigad/Papers/mathematicians.pdf. Accessed 2026-03-04. 39

2026
[6]

Putnam2025: Our solution to putnam 2025 (axiomprover lean proofs)

Axiom Math. Putnam2025: Our solution to putnam 2025 (axiomprover lean proofs). https://github.com/AxiomMath/putnam2025, January 2026. GitHub repository (released Jan 2026). Accessed 2026-01-21

2025
[7]

Mlfmf: Data sets for machine learning for mathematical formalization

Andrej Bauer, Matej Petković, and Ljupčo Todorovski. Mlfmf: Data sets for machine learning for mathematical formalization. InAdvances in Neural Information Processing Systems (NeurIPS), volume 36, 2023. URLhttps://arxiv.org/abs/2310.16005

work page arXiv 2023
[8]

Courier Corporation, 2013

Rodney J Baxter.Exactly Solved Models in Statistical Mechanics. Courier Corporation, 2013

2013
[9]

Beckenbach

Edwin F. Beckenbach. Interesting integers.The American Mathematical Monthly, 52 (4):211, April 1945

1945
[10]

Machine learning and information theory concepts towards an AI mathematician.Bull

Yoshua Bengio and Nikolay Malkin. Machine learning and information theory concepts towards an AI mathematician.Bull. Amer. Math. Soc. (N.S.), 61(3):457–469, July
[11]

doi: 10.1090/bull/1839

work page doi:10.1090/bull/1839
[12]

na, 1988

Charles H Bennett.Logical depth and physical complexity. na, 1988

1988
[13]

Erdős problems.https://www.erdosproblems.com/

Thomas Bloom. Erdős problems.https://www.erdosproblems.com/. URLhttps: //www.erdosproblems.com/. Accessed: 2026-01-20

2026
[14]

Olausson, Lionel Wong, Gabriel Grand, Joshua B

Matthew Bowers, Theo X. Olausson, Lionel Wong, Gabriel Grand, Joshua B. Tenen- baum, KevinEllis, andArmandoSolar-Lezama. Top-DownSynthesisforLibraryLearn- ing.Proceedings of the ACM on Programming Languages, 7(POPL):41:1182–41:1213, January 2023. doi: 10.1145/3571234. URLhttps://dl.acm.org/doi/10.1145/357 1234

work page doi:10.1145/3571234 2023
[15]

Brenner, Vincent Cohen-Addad, and David Woodruff.Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

Michael P. Brenner, Vincent Cohen-Addad, and David Woodruff. Solving an open problem in theoretical physics using ai-assisted discovery, 2026. URLhttps://arxiv. org/abs/2603.04735

work page arXiv 2026
[16]

The motivic class of the space of genus0maps to the flag variety.arXiv preprint arXiv:2601.07222, 2026

Jim Bryan, Balázs Elek, Freddie Manners, George Salafatinos, and Ravi Vakil. The motivic class of the space of genus0maps to the flag variety, 2026. URLhttps: //arxiv.org/abs/2601.07222

work page arXiv 2026
[17]

Samuel R. Buss. On gödel’s theorems on lengths of proofs i: Number of lines and speedup for arithmetics.Journal of Symbolic Logic, 59(3):737–756, 1994. doi: 10.230 7/2275906

1994
[18]

How deep neural networks learn compositional data: The random hierarchy model.Physical Review X, 14(3):031001, 2024

Francesco Cagnetta, Leonardo Petrini, Umberto M Tomasini, Alessandro Favero, and Matthieu Wyart. How deep neural networks learn compositional data: The random hierarchy model.Physical Review X, 14(3):031001, 2024

2024
[19]

Intrinsically motivated reinforcement learning.Advances in neural information processing systems, 17, 2004

Nuttapong Chentanez, Andrew Barto, and Satinder Singh. Intrinsically motivated reinforcement learning.Advances in neural information processing systems, 17, 2004

2004
[20]

Coester,Transposition is nearly optimal for IID list update, preprint, arXiv:2603.10244 (2026)

Christian Coester. Transposition is nearly optimal for iid list update, 2026. URL https://arxiv.org/abs/2603.10244

work page arXiv 2026
[21]

Springer Science & Business Media, 2012

Simon Colton.Automated theory formation in pure mathematics. Springer Science & Business Media, 2012. 40

2012
[22]

On the notion of interestingness in automatedmathematicaldiscovery.International Journal of Human-Computer Studies, 53(3):351–375, 2000

Simon Colton, Alan Bundy, and Toby Walsh. On the notion of interestingness in automatedmathematicaldiscovery.International Journal of Human-Computer Studies, 53(3):351–375, 2000. doi: 10.1006/ijhc.2000.0394. URLhttps://www.research.ed. ac.uk/files/408775/On_the_notion_of_interestingness_in_automated_mathe matical_discovery.pdf

work page doi:10.1006/ijhc.2000.0394 2000
[23]

Automatic invention of integer sequences

Simon Colton, Alan Bundy, and Toby Walsh. Automatic invention of integer sequences. InAAAI/IAAI, pages 558–563, 2000

2000
[24]

Coarse structure.Wikipedia, The Free Encyclopedia, 2025

Wikipedia contributors. Coarse structure.Wikipedia, The Free Encyclopedia, 2025. URLhttps://en.wikipedia.org/wiki/Coarse_structure

2025
[25]

Working with machines in mathematics.Bull

Alex Davies. Working with machines in mathematics.Bull. Amer. Math. Soc. (N.S.), 61(3):387–394, July 2024. doi: 10.1090/bull/1843

work page doi:10.1090/bull/1843 2024
[26]

Automated conjecturing with \emph{TxGraffiti}

Randy Davila. Automated conjecturing in mathematics with\emph{TxGraffiti}.arXiv preprint arXiv:2409.19379, 2024

work page internal anchor Pith review arXiv 2024
[27]

Graffiti3: Compact theory libraries for automated mathematical dis- covery.Research Square, 2026

Randy Davila. Graffiti3: Compact theory libraries for automated mathematical dis- covery.Research Square, 2026. doi: 10.21203/rs.3.rs-8493329/v1. Posted January 19, 2026

work page doi:10.21203/rs.3.rs-8493329/v1 2026
[28]

Z3: An efficient smt solver

Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient smt solver. InInternational conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 337–340. Springer, 2008

2008
[29]

Proofs for a price: tomorrow’s ultra-rigorous mathematical culture

Silvia De Toffoli. Proofs for a price: tomorrow’s ultra-rigorous mathematical culture. Bull. Amer. Math. Soc. (N.S.), 61(3):395–410, July 2024. doi: 10.1090/bull/1823

work page doi:10.1090/bull/1823 2024
[30]

Morgan Kaufmann Publishers, San Francisco, CA, 2003

Rina Dechter.Constraint Processing. Morgan Kaufmann Publishers, San Francisco, CA, 2003. ISBN 978-1558608906

2003
[31]

Alephzero and mathematical experience.Bull

Simon DeDeo. Alephzero and mathematical experience.Bull. Amer. Math. Soc. (N.S.), 61(3):375–386, July 2024. doi: 10.1090/bull/1824

work page doi:10.1090/bull/1824 2024
[32]

Some history of the development of Graffiti

Ermelinda DeLaViña. Some history of the development of Graffiti. In Siemion Fajt- lowicz, Patrick W. Fowler, Pierre Hansen, Melvin F. Janowitz, and Fred S. Roberts, editors,Graphs and Discovery, volume 69 ofDIMACS Series in Discrete Mathemat- ics and Theoretical Computer Science, pages 81–118. American Mathematical Society, Providence, RI, 2005. doi: 10.1...

work page doi:10.1090/dimacs/069/06 2005
[33]

Dreamcoder: Growing generalizable, interpretable knowledge with wake-sleep bayesian program learning,

Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, and Joshua B. Tenenbaum. Dream- Coder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian pro- gram learning.arXiv:2006.08381 [cs], June 2020. URLhttp://arxiv.org/abs/2006 .08381. arXiv: 2006.08381

work page arXiv 2006
[34]

Standard models of arithmetic

Ali Enayat. Standard models of arithmetic. In Martin Kaså, editor,Idées Fixes: a Festschrift dedicated to Christian Bennet on the occasion of his 60th birthday, pages 55–64. University of Gothenburg Publications, Gothenburg, 2014. URLhttps://ww w.researchgate.net/publication/281936805_STANDARD_MODELS_OF_ARITHMETIC. Festschrift chapter. 41

work page arXiv 2014
[35]

Frontiermath: Open problems.https://epoch.ai/frontiermath/open-p roblems,

Epoch AI. Frontiermath: Open problems.https://epoch.ai/frontiermath/open-p roblems, . Accessed: 2026-01-27

2026
[36]

Frontiermath: Llm benchmark for advanced ai math reasoning

Epoch AI. Frontiermath: Llm benchmark for advanced ai math reasoning. Online, . URLhttps://epoch.ai/frontiermath. Accessed: 2026-01-25

2026
[37]

On conjectures of graffiti

Siemion Fajtlowicz. On conjectures of graffiti. InAnnals of discrete mathematics, volume 38, pages 113–118. Elsevier, 1988

1988
[38]

Aletheia tackles FirstProof autonomously.arXiv preprint arXiv:2602.21201, 2026

Tony Feng, Junehyuk Jung, Sang hyun Kim, Carlo Pagano, Sergei Gukov, Chiang- Chiang Tsai, David Woodruff, Adel Javanmard, Aryan Mokhtari, Dawsen Hwang, Yuri Chervonyi, Jonathan N. Lee, Garrett Bingham, Trieu H. Trinh, Vahab Mirrokni, Quoc V. Le, and Thang Luong. Aletheia tackles firstproof autonomously, 2026. URL https://arxiv.org/abs/2602.21201

work page arXiv 2026
[39]

Truth and speed-up.The Review of Symbolic Logic, 7(2):319–340,

Martin Fischer. Truth and speed-up.The Review of Symbolic Logic, 7(2):319–340,
[40]

Will machines change mathematics?Bull

Maia Fraser, Andrew Granville, Michael Harris, Colin McLarty, Emily Riehl, and Ak- shay Venkatesh. Will machines change mathematics?Bull. Amer. Math. Soc. (N.S.), 61(3):373–374, July 2024. doi: 10.1090/bull/1842

work page doi:10.1090/bull/1842 2024
[41]

Natural deduction via graphs: formal definition and computation rules.Mathematical Structures in Computer Science, 17:485–526, 2007

Herman Geuvers and Iris Loeb. Natural deduction via graphs: formal definition and computation rules.Mathematical Structures in Computer Science, 17:485–526, 2007. doi: 10.1017/S0960129507006123

work page doi:10.1017/s0960129507006123 2007
[42]

Lattice-valued bottleneck duality,

Robert Ghrist, Julian Gould, and Miguel Lopez. Lattice-valued bottleneck duality,
[43]

URLhttps://arxiv.org/abs/2410.00315

work page arXiv
[44]

Linear logic , journal =

Jean-Yves Girard. Linear logic.Theoretical Computer Science, 50:1–102, 1987. doi: 10.1016/0304-3975(87)90045-4

work page doi:10.1016/0304-3975(87)90045-4 1987
[45]

FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI.arXiv preprint arXiv:2411.04872, 2024

Elliot Glazer, Ege Erdil, Tamay Besiroglu, Diego Chicharro, Evan Chen, Alex Gunning, Caroline Falkman Olsson, Jean-Stanislas Denain, Anson Ho, Emily de Oliveira Santos, Olli Järviniemi, Matthew Barnett, Robert Sandler, Matej Vrzala, Jaime Sevilla, Qiuyu Ren, Elizabeth Pratt, Lionel Levine, Grant Barkley, Natalie Stewart, Bogdan Grechuk, Tetiana Grechuk, S...

work page arXiv 2024
[46]

Kac, Random walk and the theory of brownian motion, The American Ma- thematical Monthly54, 369 (1947), https://doi.org/10.1080/00029890.1947.11990189

Kurt Gödel. What is cantor’s continuum problem?The American Mathematical Monthly, 54(9):515–525, 1947. doi: 10.1080/00029890.1947.11991877

work page doi:10.1080/00029890.1947.11991877 1947
[47]

Stephen Jay Gould.Wonderful Life: The Burgess Shale and the Nature of History. W. W. Norton & Company, New York, 1989

1989
[48]

Olausson, Muxin Liu, Joshua B

Gabriel Grand, Lionel Wong, Matthew Bowers, Theo X. Olausson, Muxin Liu, Joshua B. Tenenbaum, and Jacob Andreas. LILO: Learning Interpretable Libraries by Compressing and Documenting Code, October 2023. URLhttp://arxiv.org/ab s/2310.19791. arXiv:2310.19791 [cs]. 42

work page arXiv 2023
[49]

Single-minus gluon tree amplitudes are nonzero,

Alfredo Guevara, Alexandru Lupsasca, David Skinner, Andrew Strominger, and Kevin Weil. Single-minus gluon tree amplitudes are nonzero, 2026. URLhttps://arxiv.or g/abs/2602.12176

work page arXiv 2026
[50]

Formal entity graphs as complex networks: Assessing centrality metrics of the archive of formal proofs

Fabian Huch. Formal entity graphs as complex networks: Assessing centrality metrics of the archive of formal proofs. InIntelligent Computer Mathematics: 15th International Conference, CICM 2022, Prague, Czech Republic, September 19–23, 2022, Proceedings, pages 151–166. Springer, 2022. doi: 10.1007/978-3-031-16681-5_10

work page doi:10.1007/978-3-031-16681-5_10 2022
[51]

Uijeong Jang and Ernest K. Ryu. Point convergence of nesterov’s accelerated gradient method: An ai-assisted proof, 2025. URLhttps://arxiv.org/abs/2510.23513

work page arXiv 2025
[52]

Conjecture synthesis for inductive theories.Journal of Automated Reasoning, 47(3):251–289, 2011

Moa Johansson, Lucas Dixon, and Alan Bundy. Conjecture synthesis for inductive theories.Journal of Automated Reasoning, 47(3):251–289, 2011

2011
[53]

Martin.Speech and Language Processing

Daniel Jurafsky and James H. Martin.Speech and Language Processing. Prentice Hall, Upper Saddle River, NJ, 2 edition, 2009. ISBN 978-0131873216

2009
[54]

Douglas B. Lenat. Automated theory formation in mathematics. InProceedings of the 5th International Joint Conference on Artificial Intelligence (IJCAI-77), pages 833– 842, Cambridge, MA, 1977. URLhttps://www.ijcai.org/Proceedings/77-2/Pap ers/061.pdf

1977
[55]

Why does deep and cheap learning work so well?Journal of Statistical Physics, 168(6):1223–1247, 2017

Henry W Lin, Max Tegmark, and David Rolnick. Why does deep and cheap learning work so well?Journal of Statistical Physics, 168(6):1223–1247, 2017

2017
[56]

Numina-lean-agent: An open and gen- eral agentic reasoning system for formal mathematics

Junqi Liu, Zihao Zhou, Zekai Zhu, Marco Dos Santos, Weikun He, Jiawei Liu, Ran Wang, Yunzhou Xie, Junqiao Zhao, Qiufeng Wang, Lihong Zhi, Jia Li, and Wenda Li. Numina-lean-agent: An open and general agentic reasoning system for formal mathematics, 2026. URLhttps://arxiv.org/abs/2601.14027

work page arXiv 2026
[57]

Advanced version of gemini with deep think officially achieves gold-medal standard at the international mathematical olympiad

Thang Luong and Edward Lockhart. Advanced version of gemini with deep think officially achieves gold-medal standard at the international mathematical olympiad. https://deepmind.google/discover/blog/advanced-version-of-gemini-with-d eep-think-officially-achieves-gold-medal-standard-at-the-international -mathematical-olympiad/, 7 2025. Google DeepMind; acce...

2025
[58]

Introducing gauss, an agent for autoformalization, 2025

Math, Inc. Introducing gauss, an agent for autoformalization, 2025. URLhttps: //www.math.inc/gauss. Announcement of Gauss; completion of the strong Prime Number Theorem formalization in Lean

2025
[59]

Sphere-packing-lean: Sphere packing in lean

Math, Inc. Sphere-packing-lean: Sphere packing in lean. GitHub repository, 2026. URL https://github.com/math-inc/Sphere-Packing-Lean. A Lean formalisation of sphere-packingoptimalityindimensions8and24(andperiodicuniquenessindimension 24). Accessed: 2026-03-05

2026
[60]

MathZero, The Classification Problem, and Set-Theoretic Type Theory.arXiv:2005.05512 [cs], May 2020

David McAllester. MathZero, The Classification Problem, and Set-Theoretic Type Theory.arXiv:2005.05512 [cs], May 2020. URLhttp://arxiv.org/abs/2005.05512. arXiv: 2005.05512

work page arXiv 2005
[61]

Single axioms for groups and abelian groups with various operations

William McCune. Single axioms for groups and abelian groups with various operations. Journal of Automated Reasoning, 10(1):1–13, 1993. 43

1993
[62]

Poincaré on the value of reasoning machines.Bull

Colin McLarty. Poincaré on the value of reasoning machines.Bull. Amer. Math. Soc. (N.S.), 61(3):411–422, July 2024. doi: 10.1090/bull/1822

work page doi:10.1090/bull/1822 2024
[63]

Google a.i

Cade Metz. Google a.i. system wins gold medal in international math olympiad.https: //www.nytimes.com/2025/07/21/technology/google-ai-international-mathema tics-olympiad.html, 7 2025. The New York Times; accessed 2025-10-20

2025
[64]

Mathematical discovery in the age of artificial intel- ligence.Nature Physics, pages 1–3, 2025

Bartosz Naskręcki and Ken Ono. Mathematical discovery in the age of artificial intel- ligence.Nature Physics, pages 1–3, 2025

2025
[65]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Alexander Novikov, Ngân V˜ u, Marvin Eisenberger, Emilien Dupont, Po-Sen Huang, Adam Zsolt Wagner, Sergey Shirobokov, Borislav Kozlovskii, Francisco JR Ruiz, Abbas Mehrabian, et al. Alphaevolve: A coding agent for scientific and algorithmic discovery. arXiv preprint arXiv:2506.13131, 2025

work page internal anchor Pith review arXiv 2025
[66]

Automatedmathematicsandthereconfigurationofproofandlabor

RodrigoOchigame. Automatedmathematicsandthereconfigurationofproofandlabor. Bull. Amer. Math. Soc. (N.S.), 61(3):423–437, July 2024. doi: 10.1090/bull/1821

work page doi:10.1090/bull/1821 2024
[67]

Our first proof submissions

OpenAI. Our first proof submissions. OpenAI, February 2026. URLhttps://openai .com/index/first-proof-submissions/. Accessed: 2026-02-25

2026
[68]

Gabriel Poesia, David Broman, Nick Haber, and Noah D. Goodman. Learning Formal Mathematics From Intrinsic Motivation, June 2024. URLhttp://arxiv.org/abs/24 07.00695. arXiv:2407.00695 [cs]

work page arXiv 2024
[69]

Compositional sparsity of learnable functions.Bull

Tomaso Poggio and Maia Fraser. Compositional sparsity of learnable functions.Bull. Amer. Math. Soc. (N.S.), 61(3):438–456, July 2024. doi: 10.1090/bull/1820

work page doi:10.1090/bull/1820 2024
[70]

The lengths of proofs

Pavel Pudlák. The lengths of proofs. In Samuel R. Buss, editor,Handbook of Proof Theory, volume 137 ofStudies in Logic and the Foundations of Mathematics, pages 547–637. Elsevier–North-Holland, Amsterdam, 1998. URLhttps://users.math.cas .cz/~pudlak/length.pdf. See Theorem 7.2.2 (speed-up by any function provably total in the base theory) and Theorem 7.3.2

1998
[71]

Finite entropy sums in quantum field theory, 2025

Mark Van Raamsdonk. Finite entropy sums in quantum field theory, 2025. URL https://arxiv.org/abs/2508.21276

work page arXiv 2025
[72]

Mathematical discoveries from program search with large language models.Nature, 625(7995):468–475, 2024

Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M Pawan Kumar, Emilien Dupont, Francisco JR Ruiz, Jordan S Ellenberg, Pengming Wang, Omar Fawzi, et al. Mathematical discoveries from program search with large language models.Nature, 625(7995):468–475, 2024

2024
[73]

Learning hierarchical cate- gories in deep neural networks

Andrew M Saxe, James L McClellans, and Surya Ganguli. Learning hierarchical cate- gories in deep neural networks. InProceedings of the Annual Meeting of the Cognitive Science Society, volume 35, 2013

2013
[74]

Schwartz

Matthew D. Schwartz. Resummation of the c-parameter sudakov shoulder using effec- tive field theory, 2026. URLhttps://arxiv.org/abs/2601.02484

work page arXiv 2026
[75]

MIT press Cambridge, 1998

Richard S Sutton, Andrew G Barto, et al.Reinforcement learning: An introduction, volume 1. MIT press Cambridge, 1998. 44

1998
[76]

Mastodon post.https://mathstodon.xyz/@tao/115306424727150237, 10 2025

Terence Tao. Mastodon post.https://mathstodon.xyz/@tao/115306424727150237, 10 2025

work page arXiv 2025
[77]

Hilbert’stwenty-fourthproblem.The American Mathematical Monthly, 110(1):1–24, 2003

RüdigerThiele. Hilbert’stwenty-fourthproblem.The American Mathematical Monthly, 110(1):1–24, 2003. doi: 10.1080/00029890.2003.11919933. URLhttps://doi.org/10 .1080/00029890.2003.11919933

work page doi:10.1080/00029890.2003.11919933 2003
[78]

Putnambench leaderboard.https://trishullab.github.io/PutnamB ench/leaderboard.html

Trishul Lab. Putnambench leaderboard.https://trishullab.github.io/PutnamB ench/leaderboard.html. Webpage. Accessed 2026-01-21

2026
[79]

Putnambench: Evaluating neural theorem-provers on the putnam mathematical competition.CoRR, abs/2407.11214, 2024

George Tsoukalas, Jasper Lee, John Jennings, Jimmy Xin, Michelle Ding, Michael Jen- nings, Amitayush Thakur, and Swarat Chaudhuri. Putnambench: Evaluating neural theorem-provers on the putnam mathematical competition.https://arxiv.org/abs/ 2407.11214, 2024. Accessed 2026-01-21

work page arXiv 2024
[80]

Learning Interestingness in Automated Mathematical Theory Formation, November 2025

George Tsoukalas, Rahul Saha, Amitayush Thakur, Sabrina Reguyal, and Swarat Chaudhuri. Learning Interestingness in Automated Mathematical Theory Formation, November 2025. URLhttp://arxiv.org/abs/2511.14778. arXiv:2511.14778 [cs] version: 1

work page arXiv 2025

Showing first 80 references.