Synthesizing Inductive Invariants for Distributed Protocols via IC3 and Large Language Models

Guangyuan Wu; Hengfeng Wei; Taolue Chen; Weining Cao; Xiaoxing Ma; Yuan Yao

arxiv: 2605.24619 · v1 · pith:C2VVFW33new · submitted 2026-05-23 · 💻 cs.SE

Synthesizing Inductive Invariants for Distributed Protocols via IC3 and Large Language Models

Weining Cao , Guangyuan Wu , Yuan Yao , Hengfeng Wei , Taolue Chen , Xiaoxing Ma This is my paper

Pith reviewed 2026-06-30 13:01 UTC · model grok-4.3

classification 💻 cs.SE

keywords inductive invariantsdistributed protocolsTLA+IC3large language modelssafety verificationneuro-symbolic methodsprotocol verification

0 comments

The pith

IC3Syn combines IC3 with LLMs to synthesize inductive invariants for all 29 tested distributed protocols in TLA+.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces IC3Syn to automatically find inductive invariants needed to prove safety in distributed protocols. These invariants must hold initially and be preserved by transitions while implying the safety property. Existing methods either limit the protocol models or require human-written templates. IC3Syn uses a symbolic controller to create specific tasks and an LLM to reason about the protocol at a high level. It succeeds on every protocol in the evaluation set, with the invariants verified to work for the complete unbounded versions using a theorem prover.

Core claim

IC3Syn executes an IC3-style process over TLA+ states where the IC3 controller decomposes the invariant search into blocking tasks and the LLM provides the missing protocol-level reasoning, enabling discovery of invariants on finite instances that TLAPS confirms are inductive for the full protocol.

What carries the argument

The IC3Syn neuro-symbolic framework that pairs a symbolic IC3 controller for task decomposition with LLM protocol-level reasoning.

If this is right

Safety can be established for protocols like MongoLoglessDynamicRaft that previous tools could not handle.
The approach applies to consensus, reconfiguration, and client-server protocols without logical restrictions.
Invariants synthesized on finite instances transfer to unbounded models.
No manual templates are required for the synthesis process.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This integration could reduce the need for expert-crafted templates in verifying additional classes of systems.
Similar neuro-symbolic loops might be tested on liveness properties or other specification languages.
The results on industrial-scale examples point to possible use in automated protocol design pipelines.

Load-bearing premise

The invariants discovered on finite instances remain inductive when applied to the full unbounded protocol without subtle flaws that the theorem prover overlooks.

What would settle it

A case where the candidate invariants pass checks on finite instances but fail to be inductive or imply safety when checked by TLAPS on the unbounded TLA+ model.

Figures

Figures reproduced from arXiv: 2605.24619 by Guangyuan Wu, Hengfeng Wei, Taolue Chen, Weining Cao, Xiaoxing Ma, Yuan Yao.

**Figure 1.** Figure 1: Overview of IC3Syn. The IC3 loop iterates through four stages: frontier query, backward predecessor search, blocking-clause search, and clause push. They are coordinated by three components: (1) an IC3 controller that maintains frames and manages backward search from frontier bad states, (2) a verification backend where Apalache handles symbolic witness queries and TLC screens candidate blocking clauses ag… view at source ↗

read the original abstract

Distributed protocols are notoriously difficult to verify correctly. Proving safety typically requires inductive invariants that both imply the desired property and are preserved by every protocol transition; yet inferring such invariants remains a major bottleneck: existing approaches either restrict the protocol models to a decidable fragment of first-order logic or demand expert-crafted templates. We present IC3Syn, a neuro-symbolic framework that synthesizes inductive invariants by executing an IC3-style process over TLA+ states with the assistance of Large Language Models (LLMs). At large, IC3Syn combines a symbolic IC3 controller, which decomposes invariant synthesis into focused blocking tasks and an LLM which provides protocol-level reasoning that IC3 alone lacks for TLA+ specifications. This integration enables a disciplined yet flexible search for invariants without imposing logical restrictions or requiring manual templates. We evaluate IC3Syn on 29 distributed protocols spanning consensus, reconfiguration and client-server systems, and compare it against Endive, IC3PO, SWISS and DistAI. IC3Syn discovers candidate invariants for all 29 protocols, including MongoLoglessDynamicRaft (MLDR), an industrial-scale Raft-based reconfiguration protocol for which none of the compared tools reports a solution, as well as one complex Paxos variant. In each case, the invariants synthesized on finite instances are shown in TLAPS to be inductive for the full unbounded protocol, thereby establishing safety.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

IC3Syn gets invariants for all 29 TLA+ protocols tested, including the hard MLDR case others miss, by pairing IC3 blocking with LLM reasoning, then checks them in TLAPS.

read the letter

The main point is that IC3Syn runs an IC3-style loop on finite instances of TLA+ specs and uses an LLM to generate the protocol-level clauses that pure symbolic search struggles with. It reports candidate invariants for every one of the 29 protocols, including the industrial MongoLoglessDynamicRaft reconfiguration protocol and a complex Paxos variant, and then shows via TLAPS that those invariants are inductive over the full unbounded models.

What the paper does well is demonstrate concrete progress on a known bottleneck without forcing the spec into a decidable fragment or requiring hand-crafted templates. The direct comparison to Endive, IC3PO, SWISS, and DistAI highlights the gain on the larger examples.

The soft spots are in the LLM side and the finite-to-unbounded transfer. The abstract gives no numbers on prompt strategies, rejection rates, or how many LLM outputs were discarded, so it is hard to judge how much of the result depends on tuning versus the core method. The stress-test concern about whether the chosen finite bounds surface every invariant needed for the unbounded case is reasonable to raise, even though TLAPS is run on the final set.

This work is aimed at people building or using automated tools for distributed-systems verification. It is worth sending to peer review so the experimental details and the exact LLM integration can be examined.

Referee Report

2 major / 2 minor

Summary. The paper presents IC3Syn, a neuro-symbolic framework that combines a symbolic IC3 controller with LLMs to synthesize inductive invariants for TLA+ specifications of distributed protocols. The approach decomposes synthesis into focused blocking tasks on finite instances, using the LLM for protocol-level reasoning to generate candidate clauses. It reports success on all 29 evaluated protocols (including MongoLoglessDynamicRaft and a complex Paxos variant), with the synthesized invariants verified via TLAPS to be inductive and to imply safety over the full unbounded models, outperforming Endive, IC3PO, SWISS, and DistAI.

Significance. If the results hold, the work is significant for reducing reliance on expert templates or decidable fragments in distributed protocol verification. Success on an industrial-scale reconfiguration protocol like MLDR, where prior tools report no solution, is a notable practical advance. Explicit credit is due for the machine-checked TLAPS proofs establishing inductiveness and safety on the unbounded TLA+ models, and for the reproducible empirical comparison across multiple baselines.

major comments (2)

[Evaluation section] Evaluation section (and abstract): The 100% success rate on 29 protocols is reported without quantitative data on LLM usage, such as the number of candidates generated per protocol, rejection rates, prompt variations tested, or failure modes encountered (especially for MLDR). This information is load-bearing for assessing the reliability and reproducibility of the LLM component in the neuro-symbolic integration.
[Approach and evaluation sections] Approach and evaluation sections: The method relies on finite-instance IC3 blocking tasks to surface candidates that TLAPS then certifies for the unbounded model. No argument or sensitivity analysis is provided showing that the chosen finite bounds (e.g., for MLDR) are adequate to surface all invariants required for unbounded inductiveness, even though TLAPS accepts the final sets. This is load-bearing for the transfer claim underlying success on complex protocols.

minor comments (2)

[Evaluation section] The paper would benefit from a table summarizing per-protocol metrics (e.g., number of invariants found, TLAPS proof times) to complement the high-level success claims.
[§3] Notation for the IC3 blocking tasks and LLM interaction could be clarified with a small example in §3 to make the decomposition more accessible.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our submission. We address each major comment below and indicate where revisions will be made to strengthen the manuscript.

read point-by-point responses

Referee: [Evaluation section] Evaluation section (and abstract): The 100% success rate on 29 protocols is reported without quantitative data on LLM usage, such as the number of candidates generated per protocol, rejection rates, prompt variations tested, or failure modes encountered (especially for MLDR). This information is load-bearing for assessing the reliability and reproducibility of the LLM component in the neuro-symbolic integration.

Authors: We agree that quantitative details on LLM usage would strengthen reproducibility claims. In the revised manuscript we will add a new subsection (or table) in the evaluation reporting, for each of the 29 protocols: (i) total LLM queries issued, (ii) number of unique candidate clauses generated, (iii) rejection rate by the IC3 controller, and (iv) any prompt-template variations that were explored. For MLDR we will additionally document the specific failure modes observed before a successful invariant set was obtained. These statistics were recorded during our experiments and can be included without altering the core claims. revision: yes
Referee: [Approach and evaluation sections] Approach and evaluation sections: The method relies on finite-instance IC3 blocking tasks to surface candidates that TLAPS then certifies for the unbounded model. No argument or sensitivity analysis is provided showing that the chosen finite bounds (e.g., for MLDR) are adequate to surface all invariants required for unbounded inductiveness, even though TLAPS accepts the final sets. This is load-bearing for the transfer claim underlying success on complex protocols.

Authors: We acknowledge the value of an explicit argument for bound selection. The IC3 blocking phase operates only on finite instances to produce candidate clauses; the subsequent TLAPS proof establishes inductiveness and safety directly on the unbounded TLA+ model, which is the transfer guarantee. Nevertheless, we will insert a short paragraph in the approach section explaining the heuristic used to choose finite bounds (typically the smallest number of nodes/processes sufficient to expose all relevant blocking obligations) and will report, for MLDR, the concrete bounds employed together with a brief note that increasing those bounds did not yield additional necessary clauses. A full sensitivity sweep across multiple bound sizes for every protocol was not performed; we therefore mark this revision as partial. revision: partial

Circularity Check

0 steps flagged

No circularity; empirical synthesis and external verification are independent

full rationale

The paper describes a neuro-symbolic IC3+LLM framework evaluated empirically on 29 protocols against external baselines (Endive, IC3PO, SWISS, DistAI) with TLAPS certification on unbounded TLA+ models. No equations, fitted parameters, self-definitional loops, or load-bearing self-citations reduce any claimed result to its inputs by construction. The finite-to-unbounded transfer is presented as an empirical outcome verified externally rather than derived tautologically from the method's own definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Framework rests on standard TLA+ semantics and IC3 algorithm; no new free parameters, axioms beyond domain assumptions, or invented entities are introduced in the abstract.

axioms (1)

domain assumption TLA+ transition semantics are faithfully represented in the finite instances used by IC3
The IC3 controller operates directly on TLA+ states.

pith-pipeline@v0.9.1-grok · 5796 in / 1219 out tokens · 33459 ms · 2026-06-30T13:01:30.646032+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

39 extracted references · 29 canonical work pages · 2 internal anchors

[1]

Aaron R. Bradley. 2011. SAT-based model checking without unrolling. InInternational Conference on Verification, Model Checking, and Ab- stract Interpretation. Springer, Springer, Berlin, Heidelberg, 70–87. doi:10.1007/978-3-642-18275-4_7

work page doi:10.1007/978-3-642-18275-4_7 2011
[2]

Aaron R. Bradley. 2012. Understanding IC3. InInternational Conference on Theory and Applications of Satisfiability Testing (SAT). Springer, Berlin, Heidelberg, 1–14. doi:10.1007/978-3-642-31612-8_1

work page doi:10.1007/978-3-642-31612-8_1 2012
[3]

Weining Cao, Guangyuan Wu, Tangzhi Xu, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2025. Clause2Inv: A Generate- Combine-Check Framework for Loop Invariant Inference.Proceed- ings of the ACM on Software Engineering2, ISSTA (2025), 1009–1030. doi:10.1145/3728920

work page doi:10.1145/3728920 2025
[4]

Saikat Chakraborty, Shuvendu Lahiri, Sarah Fakhoury, Akash Lal, Madanlal Musuvathi, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, and Nikhil Swamy. 2023. Ranking LLM-Generated Loop Invariants for Program Verification. InFindings of the Association for Computational Linguistics: EMNLP 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association...

work page doi:10.18653/v1/2023.findings-emnlp.614 2023
[5]

Kaustuv Chaudhuri, Damien Doligez, Leslie Lamport, and Stephan Merz. 2008. A TLA+ proof system. InProceedings of the LPAR Work- shop on Knowledge Exchange: Automated Provers and Proof Assis- tants (KEAPPA) (CEUR Workshop Proceedings, Vol. 418). CEUR-WS.org, Aachen, Germany, 17–37.https://ceur-ws.org/Vol-418/paper2.pdf

2008
[6]

Alessandro Cimatti and Alberto Griggio. 2016. Infinite-state invariant checking with IC3 and predicate abstraction.Formal Methods in System Design49, 3 (2016), 190–218. doi:10.1007/s10703-016-0257-4

work page doi:10.1007/s10703-016-0257-4 2016
[7]

Alessandro Cimatti, Alberto Griggio, Sergio Mover, and Stefano Tonetta. 2014. IC3 Modulo Theories via Implicit Predicate Abstraction. InTools and Algorithms for the Construction and Analysis of Systems (TACAS). Springer, Berlin, Heidelberg, 46–61. doi:10.1007/978-3-642- 54862-8_4

work page doi:10.1007/978-3-642- 2014
[8]

Niklas Eén, Alan Mishchenko, and Robert Brayton. 2011. Efficient implementation of property directed reachability. In2011 Formal Meth- ods in Computer-Aided Design (FMCAD). IEEE, Piscataway, NJ, USA, 125–134.https://ieeexplore.ieee.org/document/6148886/

work page arXiv 2011
[9]

Yotam M. Y. Feldman, James R. Wilcox, Sharon Shoham, and Mooly Sagiv. 2019. Inferring Inductive Invariants from Phase Structures. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 405–425. doi:10.1007/978-3-030-25543-5_23

work page doi:10.1007/978-3-030-25543-5_23 2019
[10]

Sakallah

Aman Goel and Karem A. Sakallah. 2021. On symmetry and quan- tification: A new approach to verify distributed protocols. InNASA Formal Methods: 13th International Symposium, NFM 2021, Virtual Event, May 24–28, 2021, Proceedings. Springer, Cham, Switzerland, 131–150. doi:10.1007/978-3-030-76384-8_9

work page doi:10.1007/978-3-030-76384-8_9 2021
[11]

Jim Gray and Leslie Lamport. 2006. Consensus on Transaction Commit. ACM Transactions on Database Systems31, 1 (2006), 133–160. doi:10. 1145/1132863.1132867

work page arXiv 2006
[12]

Travis Hance, Marijn Heule, Ruben Martins, and Bryan Parno. 2021. Finding Invariants of Distributed Systems: It’s a Small (Enough) World After All. In18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21). USENIX Association, Virtual Event, 115–131. https://www.usenix.org/conference/nsdi21/presentation/hance

2021
[13]

Lorch, Bryan Parno, Michael L

Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael L. Roberts, Srinath Setty, and Brian Zill. 2015. IronFleet: Proving practical distributed systems correct. InProceedings of the 25th Symposium on Operating Systems Principles (SOSP). Association for Computing Machinery, New York, NY, USA, 1–17. doi:10.1145/2815400. 2815428

work page doi:10.1145/2815400 2015
[14]

Kryštof Hoder and Nikolaj Bjørner. 2012. Generalized property di- rected reachability. InInternational Conference on Theory and Appli- cations of Satisfiability Testing. Springer, Springer, Berlin, Heidelberg, 157–171. doi:10.1007/978-3-642-31612-8_13

work page doi:10.1007/978-3-642-31612-8_13 2012
[15]

Adharsh Kamath, Nausheen Mohammed, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, and Rahul Sharma. 2024. Leveraging LLMs for program verification. InFormal Methods in Computer-Aided Design (FMCAD). FMCAD, Inc., Prague, Czech Republic, 107–118. doi:10.34727/2024/isbn.978-3-85448-065-5_16

work page doi:10.34727/2024/isbn.978-3-85448-065-5_16 2024
[16]

Koenig, Oded Padon, Neil Immerman, and Alex Aiken

Jason R. Koenig, Oded Padon, Neil Immerman, and Alex Aiken. 2020. First-order quantified separators. InProceedings of the 41st ACM SIG- PLAN Conference on Programming Language Design and Implemen- tation. Association for Computing Machinery, New York, NY, USA, 703–717. doi:10.1145/3385412.3386018

work page doi:10.1145/3385412.3386018 2020
[17]

Igor Konnov, Jure Kukovec, and Thanh-Hai Tran. 2019. TLA+ model checking made symbolic.Proceedings of the ACM on Programming Lan- guages3, OOPSLA, Article 123 (2019), 30 pages. doi:10.1145/3360549

work page doi:10.1145/3360549 2019
[18]

Leslie Lamport. 1998. The Part-Time Parliament.ACM Transactions on Computer Systems16, 2 (1998), 133–169. doi:10.1145/279227.279229

work page doi:10.1145/279227.279229 1998
[19]

Leslie Lamport. 2001. Paxos Made Simple.ACM SIGACT News32, 4 (2001), 51–58.https://lamport.azurewebsites.net/pubs/paxos-simple. pdf

2001
[20]

2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers

Leslie Lamport. 2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, Boston, MA.https://lamport.org/tla/book.html

2002
[21]

Sakallah

Haojun Ma, Aman Goel, Jean-Baptiste Jeannin, Manos Kapritsos, Baris Kasikci, and Karem A. Sakallah. 2019. I4: Incremental Inference of Inductive Invariants for Verification of Distributed Protocols. InPro- ceedings of the 27th ACM Symposium on Operating Systems Principles. Association for Computing Machinery, New York, NY, USA, 370–384. doi:10.1145/334130...

work page doi:10.1145/3341301.3359651 2019
[22]

McMillan

Kenneth L. McMillan. 2003. Interpolation and SAT-Based Model Check- ing. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 1–13. doi:10.1007/978-3-540-45069-6_1

work page doi:10.1007/978-3-540-45069-6_1 2003
[23]

Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff. 2015. How Amazon Web Services Uses Formal Methods.Commun. ACM58, 4 (2015), 66–73. doi:10.1145/ 2699417

2015
[24]

Oded Padon, Giuliano Losa, Mooly Sagiv, and Sharon Shoham. 2017. Paxos Made EPR: Decidable Reasoning about Distributed Protocols. Proceedings of the ACM on Programming Languages1, OOPSLA, Article 108 (2017), 31 pages. doi:10.1145/3140568

work page doi:10.1145/3140568 2017
[25]

McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham

Oded Padon, Kenneth L. McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham. 2016. Ivy: Safety verification by interactive generaliza- tion. InProceedings of the 37th ACM SIGPLAN Conference on Program- ming Language Design and Implementation. Association for Computing Machinery, New York, NY, USA, 614–630. doi:10.1145/2908080.2908118

work page doi:10.1145/2908080.2908118 2016
[26]

Muhammad A. A. Pirzada, Giles Reger, Ahmed Bhayat, and Lucas C. Cordeiro. 2024. LLM-Generated Invariants for Bounded Model Check- ing Without Loop Unrolling. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. Asso- ciation for Computing Machinery, New York, NY, USA, 1395–1407. doi:10.1145/3691620.3695512

work page doi:10.1145/3691620.3695512 2024
[27]

William Schultz, Edward Ashton, Heidi Howard, and Stavros Tri- pakis. 2024. Scalable, Interpretable Distributed Protocol Verifi- cation by Inductive Proof Slicing. doi:10.48550/arXiv.2404.18048 arXiv:2404.18048 [cs.DC]

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2404.18048 2024
[28]

William Schultz, Ian Dardik, and Stavros Tripakis. 2022. Plain and simple inductive invariant inference for distributed protocols in TLA+. InProceedings of the 22nd Conference on Formal Methods in Computer- Aided Design (FMCAD). FMCAD, Inc., Austin, TX, USA, 273–283. doi:10. 34727/2022/isbn.978-3-85448-053-2_34

2022
[29]

Anjiang Wei, Tarun Suresh, Tianran Sun, Haoze Wu, Ke Wang, and Alex Aiken. 2025. Quokka: Accelerating Program Verification with LLMs via Invariant Synthesis. doi:10.48550/arXiv.2509.21629 arXiv:2509.21629 [cs.PL] 14

work page doi:10.48550/arxiv.2509.21629 2025
[30]

Cheng Wen, Jialun Cao, Jie Su, Zhiwu Xu, Shengchao Qin, Mengda He, Haokun Li, Shing-Chi Cheung, and Cong Tian. 2024. Enchanting Pro- gram Specification Synthesis by Large Language Models Using Static Analysis and Program Verification. InComputer Aided Verification: 36th International Conference, CA V 2024, Montreal, QC, Canada, July 24–27, 2024, Proceedin...

work page doi:10.1007/978-3-031-65630-9_16 2024
[31]

Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D

James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. 2015. Verdi: A frame- work for implementing and formally verifying distributed systems. InProceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. Association for Computing Ma- chinery, New York, NY, USA, 3...

work page doi:10.1145/2737924.2737958 2015
[32]

Guangyuan Wu, Weining Cao, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2024. LLM Meets Bounded Model Checking: Neuro- symbolic Loop Invariant Inference. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering(Sacra- mento, CA, USA)(ASE ’24). Association for Computing Machinery, New York, NY, USA, 406–417. do...

work page doi:10.1145/3691620.3695014 2024
[33]

Haoze Wu, Clark Barrett, and Nina Narodytska. 2024. Lemur: Inte- grating Large Language Models in Automated Program Verification. https://openreview.net/forum?id=Q3YaCghZNtICLR 2024 poster, OpenReview.net

2024
[34]

VeruSAGE: A Study of Agent-Based Verification for Rust Systems

Chenyuan Yang, Natalie Neamtu, Chris Hawblitzel, Jacob R. Lorch, and Shan Lu. 2025. VeruSAGE: A Study of Agent-Based Verification for Rust Systems. doi:10.48550/arXiv.2512.18436arXiv:2512.18436 [cs.SE]

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2512.18436arxiv:2512.18436 2025
[35]

Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan. 2021. DistAI: Data-driven automated invariant learning for distributed protocols. In15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). USENIX Association, Santa Clara, CA, 405–421.https://www.usenix.org/conference/osdi21/ presentation/yao

2021
[36]

Yuan Yu, Panagiotis Manolios, and Leslie Lamport. 1999. Model checking TLA+ specifications. InCorrect Hardware Design and Veri- fication Methods: 10th IFIP WG 10.5 Advanced Research Working Con- ference, CHARME’99. Springer, Springer, Berlin, Heidelberg, 54–66. doi:10.1007/3-540-48153-2_6

work page doi:10.1007/3-540-48153-2_6 1999
[37]

Tony Nuda Zhang, Travis Hance, Manos Kapritsos, Tej Chajed, and Bryan Parno. 2024. Inductive Invariants That Spark Joy: Using In- variant Taxonomies to Streamline Distributed Protocol Proofs. In18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). USENIX Association, Santa Clara, CA, 837–853.https: //www.usenix.org/conference/osd...

2024
[38]

Tony Nuda Zhang, Keshav Singh, Tej Chajed, Manos Kapritsos, and Bryan Parno. 2025. Basilisk: Using Provenance Invariants to Auto- mate Proofs of Undecidable Protocols. In19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25). USENIX Association, Boston, MA, 1–18.https://www.usenix.org/conference/ osdi25/presentation/zhang-tonyBest ...

2025
[39]

Yuhao Zhou and Stavros Tripakis. 2025. Towards Language Model Guided TLA+ Proof Automation. doi:10.48550/arXiv.2512.09758 arXiv:2512.09758 [cs.LO] 15

work page doi:10.48550/arxiv.2512.09758 2025

[1] [1]

Aaron R. Bradley. 2011. SAT-based model checking without unrolling. InInternational Conference on Verification, Model Checking, and Ab- stract Interpretation. Springer, Springer, Berlin, Heidelberg, 70–87. doi:10.1007/978-3-642-18275-4_7

work page doi:10.1007/978-3-642-18275-4_7 2011

[2] [2]

Aaron R. Bradley. 2012. Understanding IC3. InInternational Conference on Theory and Applications of Satisfiability Testing (SAT). Springer, Berlin, Heidelberg, 1–14. doi:10.1007/978-3-642-31612-8_1

work page doi:10.1007/978-3-642-31612-8_1 2012

[3] [3]

Weining Cao, Guangyuan Wu, Tangzhi Xu, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2025. Clause2Inv: A Generate- Combine-Check Framework for Loop Invariant Inference.Proceed- ings of the ACM on Software Engineering2, ISSTA (2025), 1009–1030. doi:10.1145/3728920

work page doi:10.1145/3728920 2025

[4] [4]

Saikat Chakraborty, Shuvendu Lahiri, Sarah Fakhoury, Akash Lal, Madanlal Musuvathi, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, and Nikhil Swamy. 2023. Ranking LLM-Generated Loop Invariants for Program Verification. InFindings of the Association for Computational Linguistics: EMNLP 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association...

work page doi:10.18653/v1/2023.findings-emnlp.614 2023

[5] [5]

Kaustuv Chaudhuri, Damien Doligez, Leslie Lamport, and Stephan Merz. 2008. A TLA+ proof system. InProceedings of the LPAR Work- shop on Knowledge Exchange: Automated Provers and Proof Assis- tants (KEAPPA) (CEUR Workshop Proceedings, Vol. 418). CEUR-WS.org, Aachen, Germany, 17–37.https://ceur-ws.org/Vol-418/paper2.pdf

2008

[6] [6]

Alessandro Cimatti and Alberto Griggio. 2016. Infinite-state invariant checking with IC3 and predicate abstraction.Formal Methods in System Design49, 3 (2016), 190–218. doi:10.1007/s10703-016-0257-4

work page doi:10.1007/s10703-016-0257-4 2016

[7] [7]

Alessandro Cimatti, Alberto Griggio, Sergio Mover, and Stefano Tonetta. 2014. IC3 Modulo Theories via Implicit Predicate Abstraction. InTools and Algorithms for the Construction and Analysis of Systems (TACAS). Springer, Berlin, Heidelberg, 46–61. doi:10.1007/978-3-642- 54862-8_4

work page doi:10.1007/978-3-642- 2014

[8] [8]

Niklas Eén, Alan Mishchenko, and Robert Brayton. 2011. Efficient implementation of property directed reachability. In2011 Formal Meth- ods in Computer-Aided Design (FMCAD). IEEE, Piscataway, NJ, USA, 125–134.https://ieeexplore.ieee.org/document/6148886/

work page arXiv 2011

[9] [9]

Yotam M. Y. Feldman, James R. Wilcox, Sharon Shoham, and Mooly Sagiv. 2019. Inferring Inductive Invariants from Phase Structures. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 405–425. doi:10.1007/978-3-030-25543-5_23

work page doi:10.1007/978-3-030-25543-5_23 2019

[10] [10]

Sakallah

Aman Goel and Karem A. Sakallah. 2021. On symmetry and quan- tification: A new approach to verify distributed protocols. InNASA Formal Methods: 13th International Symposium, NFM 2021, Virtual Event, May 24–28, 2021, Proceedings. Springer, Cham, Switzerland, 131–150. doi:10.1007/978-3-030-76384-8_9

work page doi:10.1007/978-3-030-76384-8_9 2021

[11] [11]

Jim Gray and Leslie Lamport. 2006. Consensus on Transaction Commit. ACM Transactions on Database Systems31, 1 (2006), 133–160. doi:10. 1145/1132863.1132867

work page arXiv 2006

[12] [12]

Travis Hance, Marijn Heule, Ruben Martins, and Bryan Parno. 2021. Finding Invariants of Distributed Systems: It’s a Small (Enough) World After All. In18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21). USENIX Association, Virtual Event, 115–131. https://www.usenix.org/conference/nsdi21/presentation/hance

2021

[13] [13]

Lorch, Bryan Parno, Michael L

Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael L. Roberts, Srinath Setty, and Brian Zill. 2015. IronFleet: Proving practical distributed systems correct. InProceedings of the 25th Symposium on Operating Systems Principles (SOSP). Association for Computing Machinery, New York, NY, USA, 1–17. doi:10.1145/2815400. 2815428

work page doi:10.1145/2815400 2015

[14] [14]

Kryštof Hoder and Nikolaj Bjørner. 2012. Generalized property di- rected reachability. InInternational Conference on Theory and Appli- cations of Satisfiability Testing. Springer, Springer, Berlin, Heidelberg, 157–171. doi:10.1007/978-3-642-31612-8_13

work page doi:10.1007/978-3-642-31612-8_13 2012

[15] [15]

Adharsh Kamath, Nausheen Mohammed, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, and Rahul Sharma. 2024. Leveraging LLMs for program verification. InFormal Methods in Computer-Aided Design (FMCAD). FMCAD, Inc., Prague, Czech Republic, 107–118. doi:10.34727/2024/isbn.978-3-85448-065-5_16

work page doi:10.34727/2024/isbn.978-3-85448-065-5_16 2024

[16] [16]

Koenig, Oded Padon, Neil Immerman, and Alex Aiken

Jason R. Koenig, Oded Padon, Neil Immerman, and Alex Aiken. 2020. First-order quantified separators. InProceedings of the 41st ACM SIG- PLAN Conference on Programming Language Design and Implemen- tation. Association for Computing Machinery, New York, NY, USA, 703–717. doi:10.1145/3385412.3386018

work page doi:10.1145/3385412.3386018 2020

[17] [17]

Igor Konnov, Jure Kukovec, and Thanh-Hai Tran. 2019. TLA+ model checking made symbolic.Proceedings of the ACM on Programming Lan- guages3, OOPSLA, Article 123 (2019), 30 pages. doi:10.1145/3360549

work page doi:10.1145/3360549 2019

[18] [18]

Leslie Lamport. 1998. The Part-Time Parliament.ACM Transactions on Computer Systems16, 2 (1998), 133–169. doi:10.1145/279227.279229

work page doi:10.1145/279227.279229 1998

[19] [19]

Leslie Lamport. 2001. Paxos Made Simple.ACM SIGACT News32, 4 (2001), 51–58.https://lamport.azurewebsites.net/pubs/paxos-simple. pdf

2001

[20] [20]

2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers

Leslie Lamport. 2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, Boston, MA.https://lamport.org/tla/book.html

2002

[21] [21]

Sakallah

Haojun Ma, Aman Goel, Jean-Baptiste Jeannin, Manos Kapritsos, Baris Kasikci, and Karem A. Sakallah. 2019. I4: Incremental Inference of Inductive Invariants for Verification of Distributed Protocols. InPro- ceedings of the 27th ACM Symposium on Operating Systems Principles. Association for Computing Machinery, New York, NY, USA, 370–384. doi:10.1145/334130...

work page doi:10.1145/3341301.3359651 2019

[22] [22]

McMillan

Kenneth L. McMillan. 2003. Interpolation and SAT-Based Model Check- ing. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 1–13. doi:10.1007/978-3-540-45069-6_1

work page doi:10.1007/978-3-540-45069-6_1 2003

[23] [23]

Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff. 2015. How Amazon Web Services Uses Formal Methods.Commun. ACM58, 4 (2015), 66–73. doi:10.1145/ 2699417

2015

[24] [24]

Oded Padon, Giuliano Losa, Mooly Sagiv, and Sharon Shoham. 2017. Paxos Made EPR: Decidable Reasoning about Distributed Protocols. Proceedings of the ACM on Programming Languages1, OOPSLA, Article 108 (2017), 31 pages. doi:10.1145/3140568

work page doi:10.1145/3140568 2017

[25] [25]

McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham

Oded Padon, Kenneth L. McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham. 2016. Ivy: Safety verification by interactive generaliza- tion. InProceedings of the 37th ACM SIGPLAN Conference on Program- ming Language Design and Implementation. Association for Computing Machinery, New York, NY, USA, 614–630. doi:10.1145/2908080.2908118

work page doi:10.1145/2908080.2908118 2016

[26] [26]

Muhammad A. A. Pirzada, Giles Reger, Ahmed Bhayat, and Lucas C. Cordeiro. 2024. LLM-Generated Invariants for Bounded Model Check- ing Without Loop Unrolling. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. Asso- ciation for Computing Machinery, New York, NY, USA, 1395–1407. doi:10.1145/3691620.3695512

work page doi:10.1145/3691620.3695512 2024

[27] [27]

William Schultz, Edward Ashton, Heidi Howard, and Stavros Tri- pakis. 2024. Scalable, Interpretable Distributed Protocol Verifi- cation by Inductive Proof Slicing. doi:10.48550/arXiv.2404.18048 arXiv:2404.18048 [cs.DC]

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2404.18048 2024

[28] [28]

William Schultz, Ian Dardik, and Stavros Tripakis. 2022. Plain and simple inductive invariant inference for distributed protocols in TLA+. InProceedings of the 22nd Conference on Formal Methods in Computer- Aided Design (FMCAD). FMCAD, Inc., Austin, TX, USA, 273–283. doi:10. 34727/2022/isbn.978-3-85448-053-2_34

2022

[29] [29]

Anjiang Wei, Tarun Suresh, Tianran Sun, Haoze Wu, Ke Wang, and Alex Aiken. 2025. Quokka: Accelerating Program Verification with LLMs via Invariant Synthesis. doi:10.48550/arXiv.2509.21629 arXiv:2509.21629 [cs.PL] 14

work page doi:10.48550/arxiv.2509.21629 2025

[30] [30]

Cheng Wen, Jialun Cao, Jie Su, Zhiwu Xu, Shengchao Qin, Mengda He, Haokun Li, Shing-Chi Cheung, and Cong Tian. 2024. Enchanting Pro- gram Specification Synthesis by Large Language Models Using Static Analysis and Program Verification. InComputer Aided Verification: 36th International Conference, CA V 2024, Montreal, QC, Canada, July 24–27, 2024, Proceedin...

work page doi:10.1007/978-3-031-65630-9_16 2024

[31] [31]

Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D

James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. 2015. Verdi: A frame- work for implementing and formally verifying distributed systems. InProceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. Association for Computing Ma- chinery, New York, NY, USA, 3...

work page doi:10.1145/2737924.2737958 2015

[32] [32]

Guangyuan Wu, Weining Cao, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2024. LLM Meets Bounded Model Checking: Neuro- symbolic Loop Invariant Inference. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering(Sacra- mento, CA, USA)(ASE ’24). Association for Computing Machinery, New York, NY, USA, 406–417. do...

work page doi:10.1145/3691620.3695014 2024

[33] [33]

Haoze Wu, Clark Barrett, and Nina Narodytska. 2024. Lemur: Inte- grating Large Language Models in Automated Program Verification. https://openreview.net/forum?id=Q3YaCghZNtICLR 2024 poster, OpenReview.net

2024

[34] [34]

VeruSAGE: A Study of Agent-Based Verification for Rust Systems

Chenyuan Yang, Natalie Neamtu, Chris Hawblitzel, Jacob R. Lorch, and Shan Lu. 2025. VeruSAGE: A Study of Agent-Based Verification for Rust Systems. doi:10.48550/arXiv.2512.18436arXiv:2512.18436 [cs.SE]

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2512.18436arxiv:2512.18436 2025

[35] [35]

Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan. 2021. DistAI: Data-driven automated invariant learning for distributed protocols. In15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). USENIX Association, Santa Clara, CA, 405–421.https://www.usenix.org/conference/osdi21/ presentation/yao

2021

[36] [36]

Yuan Yu, Panagiotis Manolios, and Leslie Lamport. 1999. Model checking TLA+ specifications. InCorrect Hardware Design and Veri- fication Methods: 10th IFIP WG 10.5 Advanced Research Working Con- ference, CHARME’99. Springer, Springer, Berlin, Heidelberg, 54–66. doi:10.1007/3-540-48153-2_6

work page doi:10.1007/3-540-48153-2_6 1999

[37] [37]

Tony Nuda Zhang, Travis Hance, Manos Kapritsos, Tej Chajed, and Bryan Parno. 2024. Inductive Invariants That Spark Joy: Using In- variant Taxonomies to Streamline Distributed Protocol Proofs. In18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). USENIX Association, Santa Clara, CA, 837–853.https: //www.usenix.org/conference/osd...

2024

[38] [38]

Tony Nuda Zhang, Keshav Singh, Tej Chajed, Manos Kapritsos, and Bryan Parno. 2025. Basilisk: Using Provenance Invariants to Auto- mate Proofs of Undecidable Protocols. In19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25). USENIX Association, Boston, MA, 1–18.https://www.usenix.org/conference/ osdi25/presentation/zhang-tonyBest ...

2025

[39] [39]

Yuhao Zhou and Stavros Tripakis. 2025. Towards Language Model Guided TLA+ Proof Automation. doi:10.48550/arXiv.2512.09758 arXiv:2512.09758 [cs.LO] 15

work page doi:10.48550/arxiv.2512.09758 2025