{"work":{"id":"cfdb69bb-3c0b-41d3-bd34-3167a6931bb2","openalex_id":null,"doi":null,"arxiv_id":"2508.03613","raw_key":null,"title":"Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction","authors":null,"authors_text":"Yong Lin, Shange Tang, Bohan Lyu, Ziran Yang, Jui-Hui Chung, Haoyu Zhao","year":2025,"venue":"cs.LG","abstract":"We introduce Goedel-Prover-V2, a series of open-source language models that set a new state-of-the-art in automated theorem proving. Built on the standard expert iteration and reinforcement learning pipeline, our approach incorporates three key innovations: (1) Scaffolded data synthesis: We generate synthetic tasks of increasing difficulty to train the model to master increasingly complex theorems; (2) Verifier-guided self-correction: We enable the model to iteratively revise its proofs by leveraging feedback from the Lean compiler; (3) Model averaging: We merge model checkpoints to mitigate the decrease in model output diversity in later stages of training. Our small model, Goedel-Prover-V2-8B, reaches 84.6% pass@32 on MiniF2F and outperforms DeepSeek-Prover-V2-671B under the same metric, despite being 80X smaller. Our flagship model, Goedel-Prover-V2-32B, achieves 88.1% on MiniF2F at pass@32 in standard mode and 90.4% in self-correction mode, outperforming prior SOTA by a large margin. Additionally, our flagship model solves 86 problems on PutnamBench at pass@184, securing the first place among open-source models on the leaderboard, surpassing DeepSeek-Prover-V2-671B's record of solving 47 problems by pass@1024 with a significantly smaller model size and compute budget. At the time of its release (July-August 2025), Goedel-Prover-V2 achieves the strongest overall performance among all open-source theorem provers. It also ranks among the top-performing models--including closed-source systems with publicly reported performance--under a constrained test-time compute budget. Our models, code, and data are released at https://github.com/Goedel-LM/Goedel-Prover-V2.","external_url":"https://arxiv.org/abs/2508.03613","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-25T07:50:29.449254+00:00","pith_arxiv_id":"2508.03613","created_at":"2026-05-10T10:29:25.479844+00:00","updated_at":"2026-05-25T07:50:29.449254+00:00","title_quality_ok":true,"display_title":"Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction","render_title":"Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction"},"hub":{"state":{"work_id":"cfdb69bb-3c0b-41d3-bd34-3167a6931bb2","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":26,"external_cited_by_count":null,"distinct_field_count":6,"first_pith_cited_at":"2025-09-16T06:48:11+00:00","last_pith_cited_at":"2026-05-22T15:41:27+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-05T17:59:27.495235+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":5}],"polarity_counts":[{"context_polarity":"background","n":5}],"runs":{},"summary":{},"graph":{},"authors":[]}}