arxiv: 2605.14972 · v1 · submitted 2026-05-14 · 💻 cs.SE · cs.AI · cs.HC · cs.LO


Viverra: Text-to-Code with Guarantees

Haoze Wu, Rocky Klopfenstein, Keith Farkas, Nina Narodytska


Pith reviewed 2026-05-15 03:10 UTC · model grok-4.3


keywords: text-to-code · formal verification · assertions · bounded model checking · LLM code generation · C programming · user study · code comprehension


The pith

Viverra generates C code from natural language along with machine-verified assertions that improve human comprehension of the program.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Viverra to solve the core problem that text-to-code systems offer no correctness guarantees, forcing developers to review and test LLM output manually. It does this by prompting an LLM to produce both a C program and candidate assertions for safety and correctness properties, then running those assertions through a portfolio of bounded model checkers in a compositional best-effort manner. On 18 diverse tasks the system produces verified assertions efficiently. A user study with more than 400 participants shows that presenting these verified assertions measurably raises performance on code-comprehension tasks. The approach therefore aims to turn raw LLM code into annotated programs whose key properties are machine-checked rather than merely suggested.

Core claim

Given a natural-language task description, Viverra prompts an LLM to synthesize a C program together with candidate assertions expressing safety and correctness properties. It then verifies those assertions in a compositional and best-effort manner via a portfolio of bounded model checkers. Evaluation on 18 diverse programming tasks shows that Viverra can efficiently generate code with verified assertions, and that these assertions improve users' performance on code-comprehension tasks in a user study with more than 400 participants.

What carries the argument

Compositional best-effort verification of LLM-generated assertions by a portfolio of bounded model checkers, which filters and confirms safety and correctness properties about the generated C code.

If this is right

Developers receive some formal guarantees on generated code without having to write assertions themselves.
Verified annotations reduce the manual effort required to review and maintain LLM-produced programs.
Performance gains on comprehension tasks appear across 18 different programming problems.
The verification step can be run automatically after each LLM generation without changing the original prompt.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same pattern could be applied to languages other than C if suitable bounded model checkers exist for them.
Verified assertions might also support downstream tasks such as automated test generation or incremental maintenance.
If verification coverage remains low, the system could still be useful by surfacing the small set of confirmed properties rather than claiming full correctness.

Load-bearing premise

The LLM will reliably produce candidate assertions that are both relevant to the task and simple enough for bounded model checkers to verify within practical time limits.

What would settle it

A replication of the user study in which participants shown verified assertions score no higher on comprehension questions than participants shown only the raw generated code.

Figures

Figures reproduced from arXiv: 2605.14972 by Haoze Wu, Keith Farkas, Nina Narodytska, Rocky Klopfenstein.

**Figure 1.** Overview of VIVERRA. Bounded invariants are decidable and can be checked with a Bounded Model Checker via Definition 3; full invariance over all executions is undecidable in general. A k-bounded invariant provides a formal guarantee for all executions within the stated bound. Definition 5 (Assumption). Given a program P and a set of assertions S = {⟨f1, l1, ϕ1⟩, . . . ,⟨fn, ln, ϕn⟩}, Asm(P, S) denotes the …

**Figure 2.** Example of partial co-specification for a function in the ‘Bubble Sort’ program. From the …

**Figure 3.** Comparison of correctness and user-timing between treatment and control HITs. The plots …

Original abstract

A fundamental limitation of Text-to-Code is that no guarantee can be obtained about the correctness of the generated code. Therefore, to ensure its correctness, the generated code still has to be reviewed, tested, and maintained by developers. However, parsing through LLM-generated code can be tedious and time-consuming, potentially negating the productivity gains promised by AI-coding tools. To address this challenge, we present Viverra, a system that automatically produces formally verified annotations alongside generated code to aid users' understanding of the generated program. Given a natural-language task description, Viverra prompts an LLM to synthesize a C program together with candidate assertions expressing safety and correctness properties. It then verifies those assertions in a compositional and best-effort manner via a portfolio of bounded model checkers. Evaluation on 18 diverse programming tasks suggests that Viverra can efficiently generate code with verified assertions, and that these assertions improve users' performance on code-comprehension tasks in a user study with more than 400 participants.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Viverra shows a practical pipeline for LLM-generated C code plus assertions verified by a portfolio of bounded model checkers, with evidence from 18 tasks and a 400+ person user study on comprehension gains.


The core of the paper is a system that has an LLM produce C code together with candidate assertions, then feeds those assertions to a mix of bounded model checkers that try to verify them compositionally and best-effort. On 18 tasks it generates verified assertions reasonably often, and the user study finds that people perform better on code-comprehension questions when the assertions are shown alongside the code.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Viverra, a system that prompts an LLM to generate C programs together with candidate assertions for safety and correctness properties, then verifies those assertions compositionally using a portfolio of bounded model checkers. Evaluation on 18 diverse tasks indicates efficient production of code containing verified assertions, while a user study with more than 400 participants reports that the presence of these assertions improves performance on code-comprehension tasks.

Significance. If the verification results are robust and the user-study findings hold after accounting for the bounded nature of the checks, the work offers a practical bridge between LLM code generation and formal methods. It could reduce manual review burden and improve developer understanding of generated programs, with the large-scale empirical component providing useful evidence of real-world utility.

major comments (2)

[Verification section] The central claim of 'formally verified annotations' and 'guarantees' rests on bounded model checking. The manuscript does not report the concrete loop-unrolling bounds or search depths applied to each of the 18 tasks, nor does it provide evidence that these bounds suffice to cover all relevant behaviors for programs containing loops or recursion. Without such details the verified subset supplies only partial assurance, which directly affects the strength of the claim that the assertions deliver meaningful guarantees to users.
[Evaluation on 18 tasks] The abstract states positive results, yet the manuscript should quantify, per task, how many candidate assertions were successfully verified, how many remained unverified or timed out, and whether any post-hoc selection of tasks or assertions occurred. These data are load-bearing for assessing whether the verified assertions actually provide substantial coverage rather than sporadic or trivial properties.

minor comments (2)

[User study] Specify exactly how the verified assertions were presented to participants (e.g., highlighted, distinguished from unverified ones) and whether participants were told about the verification status.
[Terminology] Ensure consistent distinction between 'verified' (bounded) and 'unverified' assertions throughout the text and figures to prevent readers from inferring stronger guarantees than the method supplies.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the thoughtful review. We address the major comments point-by-point below, and we will incorporate the suggested clarifications and additional data into the revised manuscript.


Referee: [Verification section] The central claim of 'formally verified annotations' and 'guarantees' rests on bounded model checking. The manuscript does not report the concrete loop-unrolling bounds or search depths applied to each of the 18 tasks, nor does it provide evidence that these bounds suffice to cover all relevant behaviors for programs containing loops or recursion. Without such details the verified subset supplies only partial assurance, which directly affects the strength of the claim that the assertions deliver meaningful guarantees to users.

Authors: We agree that the bounded nature of the checks requires more explicit documentation. In the revision we will add a new paragraph (and accompanying table) in the Verification section that lists the exact loop-unrolling bounds, search depths, and solver configurations used for each of the 18 tasks. For the programs in our corpus the chosen bounds were sufficient to obtain definitive 'verified' outcomes from the model checkers; we will include summary verification logs to support this. We will also revise the abstract and introduction to state clearly that the guarantees are bounded yet practically useful for the program sizes considered. revision: yes
Referee: [Evaluation on 18 tasks] The abstract states positive results, yet the manuscript should quantify, per task, how many candidate assertions were successfully verified, how many remained unverified or timed out, and whether any post-hoc selection of tasks or assertions occurred. These data are load-bearing for assessing whether the verified assertions actually provide substantial coverage rather than sporadic or trivial properties.

Authors: We will add a detailed per-task breakdown to the Evaluation section. A new table will report, for each of the 18 tasks: number of candidate assertions generated, number successfully verified, number that timed out or remained unverified, and verification time. The 18 tasks were selected a priori according to a diversity rubric; no post-hoc filtering of tasks or assertions occurred. All generated assertions were submitted to the verification pipeline. This table will allow readers to judge the coverage achieved. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system with direct measurements


The paper describes an implemented pipeline (LLM prompt for code+assertions, verification via portfolio of bounded model checkers) and supports its claims exclusively via concrete evaluation on 18 tasks plus a user study with >400 participants. No equations, fitted parameters, or self-citations are used to derive the central results; the assertions are produced and checked externally by off-the-shelf model checkers. The work is therefore self-contained against external benchmarks and contains no load-bearing step that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on standard assumptions from LLM prompting and formal verification rather than introducing new free parameters or invented entities.

axioms (2)

Domain assumption: LLMs prompted with natural-language task descriptions can produce C code together with candidate assertions that capture relevant safety and correctness properties.
This is the core premise of the synthesis step described in the abstract.
Domain assumption: Bounded model checkers can verify assertions in a compositional best-effort manner for the generated C programs.
Invoked in the verification stage; soundness and completeness within bounds are taken from the model-checking literature.

