Tao's Equational Proof Challenge Accepted (Technical Report)

Jasmin Blanchette; Lydia Kondylidou; Marijn J.H. Heule

arxiv: 2605.21200 · v1 · pith:4ZRZX3ZWnew · submitted 2026-05-20 · 💻 cs.LO

Tao's Equational Proof Challenge Accepted (Technical Report)

Lydia Kondylidou , Jasmin Blanchette , Marijn J.H. Heule This is my paper

Pith reviewed 2026-05-21 01:34 UTC · model grok-4.3

classification 💻 cs.LO

keywords proof minimizationequational proofsrewrite stepsbrute force searchheuristicssuperposition proverautomated theorem provingproof length reduction

0 comments

The pith

New minimization tool reduces 62-step equational proof to 20 rewrite steps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a proof minimization tool to find alternatives to long proofs in equational reasoning. It takes a 62-step proof from a superposition prover and shortens it to a 20-step proof using only rewrite steps through brute force and heuristics. This matters to a sympathetic reader because shorter proofs are simpler to understand and verify, advancing the reliability of computer-assisted mathematics. The tool is shown to work effectively on a collection of 1431 similar problems, including one reduced from 151 steps to 10. Overall, it demonstrates a practical way to improve upon initial proof outputs from automated systems.

Core claim

Using a combination of brute force and heuristics, and exploiting both a superposition prover and an equational prover, the tool reduces the 62-step proof to 20 steps, each corresponding to a rewrite. In an empirical evaluation, it also performs well on 1431 equational problems originating from the same project, reducing in particular a 151-step proof to only 10 steps.

What carries the argument

The proof minimization tool that combines brute-force search and heuristics to exploit two equational provers and generate shorter rewrite-based proofs.

If this is right

The 62-step proof can be replaced by a much shorter sequence of 20 rewrites.
Many equational problems admit significantly shorter proofs than those initially found by provers.
Rewrite steps provide a simple and verifiable form for equational derivations.
The approach scales to large numbers of similar problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This suggests that automated proof search often produces proofs with redundant steps that can be pruned.
Shorter proofs could facilitate better human understanding of automated mathematical discoveries.
The technique might generalize to minimization in other proof systems.

Load-bearing premise

The heuristics and brute-force search in the tool are assumed to be sufficient to discover near-minimal proofs without systematic omission of shorter alternatives.

What would settle it

An independent search or manual construction that produces a proof with fewer than 20 steps for the original equational statement would show that further reduction is possible.

read the original abstract

In the context of the Equational Theories Project, Terence Tao posed the challenge of finding alternatives to a complicated 62-step proof found by the Vampire superposition prover. We introduce a proof minimization tool called Krympa. Using a combination of brute force and heuristics, and exploiting both Vampire and the Twee equational prover, the tool reduces the 62-step proof to 20 steps, each corresponding to a rewrite. In an empirical evaluation, it also performs well on 1431 equational problems originating from the same project, reducing in particular a 151-step proof to only 10 steps.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Krympa cuts the 62-step Vampire proof to 20 rewrite steps on the Tao challenge and shows similar gains on 1431 other problems, but the outputs lack explicit steps or replayable certificates.

read the letter

The punchline here is that the authors have developed a tool called Krympa which successfully minimizes a 62-step proof generated by Vampire down to 20 steps for Terence Tao's equational challenge, and they demonstrate its effectiveness across a large collection of similar problems. What is new is the tool itself, which uses a mix of brute force and heuristics while calling on both Vampire and the Twee equational prover. The paper reports specific reductions, such as the one from 62 to 20 steps, and an even larger one from 151 to 10 steps on another instance. This is grounded in the ongoing Equational Theories Project, where shorter proofs can make a real difference for manual inspection and further automation. The paper does well in its empirical evaluation. Running the tool on 1431 problems gives a broader picture than just the single Tao challenge. It shows consistent improvements without claiming optimality in every case. That said, there is a noticeable gap in the presentation of the main result. The 20-step proof is not given explicitly, nor is there a machine-checkable certificate or replay script that would let someone else confirm each rewrite step is sound. This leaves open the possibility of a miscount or an invalid step slipping through the minimization process. The description of how the two provers are combined could also be expanded to make the method easier to reproduce. Overall, this is a solid technical report for specialists in automated theorem proving. Readers who work on proof length minimization or who contribute to projects like the one Tao is involved in will find the numbers and the tool description relevant. It is not a foundational theoretical advance, but the practical results are clear enough to merit attention. I recommend sending this to peer review. The work has enough substance and novelty in its application to deserve referee comments, particularly on strengthening the verifiability of the minimized proofs.

Referee Report

2 major / 2 minor

Summary. The paper introduces Krympa, a proof-minimization tool that combines brute-force search and heuristics with the Vampire superposition prover and the Twee equational prover. It claims to reduce a 62-step Vampire proof for Terence Tao's equational challenge to a valid 20-step proof consisting entirely of rewrite steps, and reports analogous length reductions across 1431 equational problems drawn from the Equational Theories Project, including one reduction from 151 steps to 10 steps.

Significance. If the reported reductions are sound and independently verifiable, the work would demonstrate a practical approach to shortening equational proofs that could aid insight and automation in formal mathematics. The scale of the empirical evaluation on 1431 instances is a positive feature, as is the explicit reuse of established tools (Vampire and Twee) rather than a wholly new prover.

major comments (2)

[Description of the Tao challenge result] The central claim that Krympa produces a valid 20-step rewrite proof for the Tao challenge is not accompanied by the explicit sequence of 20 rewrite steps or by a machine-checkable certificate (e.g., a Twee or Vampire proof object) that each step is a sound equational replacement. Without this artifact, the reported length reduction cannot be distinguished from an internal counting or validity error in the minimization pipeline.
[Empirical evaluation section] The empirical evaluation reports successful reductions on 1431 problems but supplies neither error bars, a breakdown of verification method for each minimized proof, nor any independent replay data. This omission makes it impossible to assess the reliability of the claimed minimality or soundness across the test suite.

minor comments (2)

[Tool description] The description of the heuristics and brute-force search strategy inside Krympa is high-level; a pseudocode outline or parameter settings would improve reproducibility.
[Results presentation] Table or figure captions that list the original and minimized proof lengths for the 1431 instances would make the scale of the evaluation easier to assess at a glance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We address each major comment below, indicating where revisions will be made to strengthen verifiability while preserving the core contributions of the work.

read point-by-point responses

Referee: The central claim that Krympa produces a valid 20-step rewrite proof for the Tao challenge is not accompanied by the explicit sequence of 20 rewrite steps or by a machine-checkable certificate (e.g., a Twee or Vampire proof object) that each step is a sound equational replacement. Without this artifact, the reported length reduction cannot be distinguished from an internal counting or validity error in the minimization pipeline.

Authors: We agree that the absence of the explicit 20-step sequence and certificate limits independent verification. The minimization pipeline invokes Twee to generate and validate each rewrite step, inheriting soundness from Twee's equational reasoning. In the revised manuscript we will append the complete sequence of 20 rewrite steps for the Tao challenge together with the corresponding Twee proof object (or a link to a supplementary archive containing it). This addition will allow direct replay and confirmation that the reduction from the original 62-step Vampire proof is both valid and minimal. revision: yes
Referee: The empirical evaluation reports successful reductions on 1431 problems but supplies neither error bars, a breakdown of verification method for each minimized proof, nor any independent replay data. This omission makes it impossible to assess the reliability of the claimed minimality or soundness across the test suite.

Authors: We acknowledge that the current presentation does not detail the verification procedure or supply replay artifacts. All minimized proofs were obtained by invoking Twee on the output of Krympa's search; Twee then confirms that each step is a sound equational replacement and that the final proof derives the goal. In revision we will add a dedicated subsection describing this replay-based verification, report aggregate statistics (including reduction ratios and success rates), and note that the full set of minimized proofs and Twee certificates will be deposited in a public repository. Individual replay data for every one of the 1431 instances cannot be reproduced in the main text for space reasons, but the repository will enable independent checking. revision: partial

Circularity Check

0 steps flagged

Empirical tool outputs on external problems are self-contained with no circular reduction

full rationale

The manuscript describes an algorithmic tool (Krympa) that invokes Vampire and Twee for search and rewriting, then reports the lengths of the resulting proof objects on externally supplied equational problems. The 62-to-20-step reduction and the 151-to-10-step reduction are direct computational outputs rather than quantities derived by definition or by fitting parameters to the same data. No equation, uniqueness theorem, or ansatz is shown to be justified solely by a self-citation that itself depends on the target result. The derivation chain therefore consists of independent algorithmic steps whose correctness can be checked by re-running the tool or inspecting the produced rewrite sequences.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the correctness of the underlying Vampire and Twee provers and on the assumption that the search procedure finds valid shorter proofs.

axioms (1)

domain assumption Vampire and Twee produce sound equational proofs
The minimization tool depends on the input proofs being valid.

invented entities (1)

Krympa no independent evidence
purpose: Proof minimization tool combining brute force and heuristics
New software artifact introduced by the authors.

pith-pipeline@v0.9.0 · 5625 in / 1066 out tokens · 38990 ms · 2026-05-21T01:34:02.443130+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

[1]

In: Merz, S., Nipkow, T

Amjad, H.: Compressing propositional refutations. In: Merz, S., Nipkow, T. (eds.) AVoCS 2006. Electronic Notes in Theoretical Computer Science, vol. 185, pp. 3–15. Elsevier (2006)

work page 2006
[2]

In: Aït-Kaci, H., Nivat, M

Bachmair, L., Dershowitz, N., Plaisted, D.A.: Completion without failure. In: Aït-Kaci, H., Nivat, M. (eds.) Rewriting Techniques, pp. 1–30. Academic Press (1989)

work page 1989
[3]

In: Kirchner, C., Kirchner, H

Bachmair, L., Ganzinger, H.: Strict basic superposition. In: Kirchner, C., Kirchner, H. (eds.) CADE 1998. LNCS, vol. 1421, pp. 160–174. Springer (1998)

work page 1998
[4]

In: Piskac, R., Rakamaric, Z

Bártek, F., Bhayat, A., Coutelier, R., Hajdú, M., Hetzenberger, M., Hozzová, P., Kovács, L., Rath, J., Rawson, M., Reger, G., Suda, M., Schoisswohl, J., Voronkov, A.: The Vampire diary. In: Piskac, R., Rakamaric, Z. (eds.) CAV 2025. LNCS, vol. 15933, pp. 57–71. Springer (2025) Tao’s Equational Proof Challenge Accepted (Technical Report) 19

work page 2025
[5]

Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Automated Reas. 56, 155–200 (2016)

work page 2016
[6]

Bolan, M., Breitner, J., Brox, J., Carlini, N., Carneiro, M., van Doorn, F., Dvorak, M., Goens, A., Hill, A., Husum, H., Mejia, H.I., Kocsis, Z.A., Floch, B.L., Bar-on, A.L., Luccioli, L., McNeil, D., Meiburg, A., Monticone, P., Nielsen, P., Osazuwa, E.O., Paolini, G., Petracci, M., Reinke, B., Renshaw, D., Rossel, M., Roux, C., Scanvic, J., Srinivas, S.,...

work page 2025
[7]

Buch, A., Hillenbrand, T.: WALDMEISTER: Development of a high performance completion-based theorem prover. Tech. rep. (1996)

work page 1996
[8]

In: Bertot, Y., Kutsia, T., Norrish, M

Clune, J., Qian, Y., Bentkamp, A., Avigad, J.: Duper: A proof-producing superposition theorem prover for dependent type theory. In: Bertot, Y., Kutsia, T., Norrish, M. (eds.) ITP 2024. LIPIcs, vol. 309, pp. 1–20. Leibniz-Zentrum für Informatik (2024)

work page 2024
[9]

In: Strichman, O., Szeider, S

Cotton, S.: Two techniques for minimizing resolution proofs. In: Strichman, O., Szeider, S. (eds.) SAT 2010. LNCS, vol. 6175, pp. 306–312. Springer (2010)

work page 2010
[10]

In: Hong, H

Denzinger, J., Schulz, S.: Recording, analyzing and presenting distributed deduction processes. In: Hong, H. (ed.) PASCO 1994. Lecture Notes Series in Computing, vol. 5, pp. 114–123. World Scientific (1994)

work page 1994
[11]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Floch, B.L.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/518970125

work page arXiv 2025
[12]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Floch, B.L.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/568313777

work page arXiv 2025
[13]

In: Benzmüller, C., Heule, M., Schmidt, R

Geoff Sutcliffe: Stepping stones in the TPTP World. In: Benzmüller, C., Heule, M., Schmidt, R. (eds.) IJCAR 2024. pp. 30–50. LNCS (2024)

work page 2024
[14]

Gu, A., Piotrowski, B., Gloeckle, F., Yang, K., Markosyan, A.H.: ProofOptimizer: Training language models to simplify proofs without human demonstrations (2025),https://arxiv.org/abs/2510.15700

work page arXiv 2025
[15]

In: Barrett, C.W., Waldmann, U

Hajdu, M., Coutelier, R., Kovács, L., Voronkov, A.: Term ordering diagrams. In: Barrett, C.W., Waldmann, U. (eds.) CADE-30. LNCS, Springer, to appear

work page
[16]

Janota, M.: Experimental results for vampire on the equational theories project (2025),https://arxiv.org/abs/2508.15856

work page arXiv 2025
[17]

In: Gramlich, B., Miller, D., Sattler, U

Järvisalo, M., Heule, M., Biere, A.: Inprocessing rules. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012. LNCS, vol. 7364, pp. 355–370. Springer (2012)

work page 2012
[18]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Kinyon, M.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/518961204

work page arXiv 2025
[19]

In: Leech, J

Knuth, D.E., Bendix, P.B.: Simple word problems in universal algebras. In: Leech, J. (ed.) Computational Problems in Abstract Algebra, pp. 263–297. Pergamon Press (1970) 20 L. Kondylidou et al

work page 1970
[20]

Zenodo (2026),https://doi.org/10.5281/zenodo.18624123

Kondylidou, L., Blanchette, J., Heule, M.: Tao’s equational proof challenge accepted. Zenodo (2026),https://doi.org/10.5281/zenodo.18624123

work page doi:10.5281/zenodo.18624123 2026
[21]

In: CPP 2023

Limperg, J., From, A.H.: Aesop: White-box best-first proof search for Lean. In: CPP 2023. pp. 253–266. ACM (2023)

work page 2023
[22]

McCune, W.: Prover9 and Mace4 (2005–2010)

work page 2005
[23]

In: Biere, A., Lutz, C., Negri, S

Morrison, K., de Moura, L.:grind: An SMT-inspired tactic for Lean 4 (short paper—system description). In: Biere, A., Lutz, C., Negri, S. (eds.) IJCAR 2026. LNCS, Springer, to appear

work page 2026
[24]

In: Bertot, Y., Kutsia, T., Norrish, M

de Moura, L., Ullrich, S.: The Lean 4 theorem prover and programming language. In: Bertot, Y., Kutsia, T., Norrish, M. (eds.) CADE-28. LNCS, vol. 12699, pp. 625–635. Springer (2021)

work page 2021
[25]

In: ITP 2025

Norman, C., Avigad, J.: Canonical for automated theorem proving in Lean. In: ITP 2025. LIPIcs, vol. 352, pp. 1–20. Leibniz-Zentrum für Informatik (2025)

work page 2025
[26]

In: Platzer, A., Sutcliffe, G

Smallbone, N.: Twee: An equational theorem prover. In: Platzer, A., Sutcliffe, G. (eds.) CADE-28. LNCS, vol. 12699, pp. 602–613. Springer (2021)

work page 2021
[27]

Stachniak, Z.: Minimization of resolution proof systems. Fundam. Informaticae 14(1), 129–146 (1991)

work page 1991
[28]

AI Communications38, 3–20 (2025)

Sutcliffe, G.: The 12th IJCAR Automated Theorem Proving System Competition—CASC-J12. AI Communications38, 3–20 (2025)

work page 2025
[29]

In: Fontaine, P., Stump, A

Sutcliffe, G., Chang, C., McGuinness, D., Lebo, T., Ding, L., da Silva, P.P.: Combining proofs to form different proofs. In: Fontaine, P., Stump, A. (eds.) PxTP 2011. pp. 60–73. LNCS (2011)

work page 2011
[30]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Tao, T.: Zulip thread “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover-community.github.io/archive/stream/219941-Machine- Learning-for-Theorem-Proving/

work page 2025
[31]

In: Clarke, E.M., Voronkov, A

Vyskočil, J., Stanovský, D., Urban, J.: Automated proof compression by invention of new definitions. In: Clarke, E.M., Voronkov, A. (eds.) LPAR-16. LNCS, vol. 6355, pp. 447–462. Springer (2010)

work page 2010
[32]

Zhu, T., Clune, J., Avigad, J., Jiang, A.Q., Welleck, S.: Premise selection for a Lean hammer (2026),https://arxiv.org/abs/2506.07477

work page arXiv 2026

[1] [1]

In: Merz, S., Nipkow, T

Amjad, H.: Compressing propositional refutations. In: Merz, S., Nipkow, T. (eds.) AVoCS 2006. Electronic Notes in Theoretical Computer Science, vol. 185, pp. 3–15. Elsevier (2006)

work page 2006

[2] [2]

In: Aït-Kaci, H., Nivat, M

Bachmair, L., Dershowitz, N., Plaisted, D.A.: Completion without failure. In: Aït-Kaci, H., Nivat, M. (eds.) Rewriting Techniques, pp. 1–30. Academic Press (1989)

work page 1989

[3] [3]

In: Kirchner, C., Kirchner, H

Bachmair, L., Ganzinger, H.: Strict basic superposition. In: Kirchner, C., Kirchner, H. (eds.) CADE 1998. LNCS, vol. 1421, pp. 160–174. Springer (1998)

work page 1998

[4] [4]

In: Piskac, R., Rakamaric, Z

Bártek, F., Bhayat, A., Coutelier, R., Hajdú, M., Hetzenberger, M., Hozzová, P., Kovács, L., Rath, J., Rawson, M., Reger, G., Suda, M., Schoisswohl, J., Voronkov, A.: The Vampire diary. In: Piskac, R., Rakamaric, Z. (eds.) CAV 2025. LNCS, vol. 15933, pp. 57–71. Springer (2025) Tao’s Equational Proof Challenge Accepted (Technical Report) 19

work page 2025

[5] [5]

Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Automated Reas. 56, 155–200 (2016)

work page 2016

[6] [6]

Bolan, M., Breitner, J., Brox, J., Carlini, N., Carneiro, M., van Doorn, F., Dvorak, M., Goens, A., Hill, A., Husum, H., Mejia, H.I., Kocsis, Z.A., Floch, B.L., Bar-on, A.L., Luccioli, L., McNeil, D., Meiburg, A., Monticone, P., Nielsen, P., Osazuwa, E.O., Paolini, G., Petracci, M., Reinke, B., Renshaw, D., Rossel, M., Roux, C., Scanvic, J., Srinivas, S.,...

work page 2025

[7] [7]

Buch, A., Hillenbrand, T.: WALDMEISTER: Development of a high performance completion-based theorem prover. Tech. rep. (1996)

work page 1996

[8] [8]

In: Bertot, Y., Kutsia, T., Norrish, M

Clune, J., Qian, Y., Bentkamp, A., Avigad, J.: Duper: A proof-producing superposition theorem prover for dependent type theory. In: Bertot, Y., Kutsia, T., Norrish, M. (eds.) ITP 2024. LIPIcs, vol. 309, pp. 1–20. Leibniz-Zentrum für Informatik (2024)

work page 2024

[9] [9]

In: Strichman, O., Szeider, S

Cotton, S.: Two techniques for minimizing resolution proofs. In: Strichman, O., Szeider, S. (eds.) SAT 2010. LNCS, vol. 6175, pp. 306–312. Springer (2010)

work page 2010

[10] [10]

In: Hong, H

Denzinger, J., Schulz, S.: Recording, analyzing and presenting distributed deduction processes. In: Hong, H. (ed.) PASCO 1994. Lecture Notes Series in Computing, vol. 5, pp. 114–123. World Scientific (1994)

work page 1994

[11] [11]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Floch, B.L.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/518970125

work page arXiv 2025

[12] [12]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Floch, B.L.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/568313777

work page arXiv 2025

[13] [13]

In: Benzmüller, C., Heule, M., Schmidt, R

Geoff Sutcliffe: Stepping stones in the TPTP World. In: Benzmüller, C., Heule, M., Schmidt, R. (eds.) IJCAR 2024. pp. 30–50. LNCS (2024)

work page 2024

[14] [14]

Gu, A., Piotrowski, B., Gloeckle, F., Yang, K., Markosyan, A.H.: ProofOptimizer: Training language models to simplify proofs without human demonstrations (2025),https://arxiv.org/abs/2510.15700

work page arXiv 2025

[15] [15]

In: Barrett, C.W., Waldmann, U

Hajdu, M., Coutelier, R., Kovács, L., Voronkov, A.: Term ordering diagrams. In: Barrett, C.W., Waldmann, U. (eds.) CADE-30. LNCS, Springer, to appear

work page

[16] [16]

Janota, M.: Experimental results for vampire on the equational theories project (2025),https://arxiv.org/abs/2508.15856

work page arXiv 2025

[17] [17]

In: Gramlich, B., Miller, D., Sattler, U

Järvisalo, M., Heule, M., Biere, A.: Inprocessing rules. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012. LNCS, vol. 7364, pp. 355–370. Springer (2012)

work page 2012

[18] [18]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Kinyon, M.: Zulip post in “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover.zulipchat.com/#narrow/channel/219941-Machine- Learning-for-Theorem-Proving/topic/A.20.28semi.29- autoformalization.20challenge.3A.20650.3D.3E448/near/518961204

work page arXiv 2025

[19] [19]

In: Leech, J

Knuth, D.E., Bendix, P.B.: Simple word problems in universal algebras. In: Leech, J. (ed.) Computational Problems in Abstract Algebra, pp. 263–297. Pergamon Press (1970) 20 L. Kondylidou et al

work page 1970

[20] [20]

Zenodo (2026),https://doi.org/10.5281/zenodo.18624123

Kondylidou, L., Blanchette, J., Heule, M.: Tao’s equational proof challenge accepted. Zenodo (2026),https://doi.org/10.5281/zenodo.18624123

work page doi:10.5281/zenodo.18624123 2026

[21] [21]

In: CPP 2023

Limperg, J., From, A.H.: Aesop: White-box best-first proof search for Lean. In: CPP 2023. pp. 253–266. ACM (2023)

work page 2023

[22] [22]

McCune, W.: Prover9 and Mace4 (2005–2010)

work page 2005

[23] [23]

In: Biere, A., Lutz, C., Negri, S

Morrison, K., de Moura, L.:grind: An SMT-inspired tactic for Lean 4 (short paper—system description). In: Biere, A., Lutz, C., Negri, S. (eds.) IJCAR 2026. LNCS, Springer, to appear

work page 2026

[24] [24]

In: Bertot, Y., Kutsia, T., Norrish, M

de Moura, L., Ullrich, S.: The Lean 4 theorem prover and programming language. In: Bertot, Y., Kutsia, T., Norrish, M. (eds.) CADE-28. LNCS, vol. 12699, pp. 625–635. Springer (2021)

work page 2021

[25] [25]

In: ITP 2025

Norman, C., Avigad, J.: Canonical for automated theorem proving in Lean. In: ITP 2025. LIPIcs, vol. 352, pp. 1–20. Leibniz-Zentrum für Informatik (2025)

work page 2025

[26] [26]

In: Platzer, A., Sutcliffe, G

Smallbone, N.: Twee: An equational theorem prover. In: Platzer, A., Sutcliffe, G. (eds.) CADE-28. LNCS, vol. 12699, pp. 602–613. Springer (2021)

work page 2021

[27] [27]

Stachniak, Z.: Minimization of resolution proof systems. Fundam. Informaticae 14(1), 129–146 (1991)

work page 1991

[28] [28]

AI Communications38, 3–20 (2025)

Sutcliffe, G.: The 12th IJCAR Automated Theorem Proving System Competition—CASC-J12. AI Communications38, 3–20 (2025)

work page 2025

[29] [29]

In: Fontaine, P., Stump, A

Sutcliffe, G., Chang, C., McGuinness, D., Lebo, T., Ding, L., da Silva, P.P.: Combining proofs to form different proofs. In: Fontaine, P., Stump, A. (eds.) PxTP 2011. pp. 60–73. LNCS (2011)

work page 2011

[30] [30]

Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)

Tao, T.: Zulip thread “Machine Learning for Theorem Proving: A (Semi)-Autoformalization Challenge (650→448)”. Zulip (2025), https://leanprover-community.github.io/archive/stream/219941-Machine- Learning-for-Theorem-Proving/

work page 2025

[31] [31]

In: Clarke, E.M., Voronkov, A

Vyskočil, J., Stanovský, D., Urban, J.: Automated proof compression by invention of new definitions. In: Clarke, E.M., Voronkov, A. (eds.) LPAR-16. LNCS, vol. 6355, pp. 447–462. Springer (2010)

work page 2010

[32] [32]

Zhu, T., Clune, J., Avigad, J., Jiang, A.Q., Welleck, S.: Premise selection for a Lean hammer (2026),https://arxiv.org/abs/2506.07477

work page arXiv 2026