pith. sign in

arxiv: 2606.17374 · v1 · pith:PTQFJWJ5new · submitted 2026-06-16 · 💻 cs.LO · cs.PL· cs.SE

Verifying the Rust Standard Library

Pith reviewed 2026-06-26 22:56 UTC · model grok-4.3

classification 💻 cs.LO cs.PLcs.SE
keywords Ruststandard libraryunsafe codeformal verificationcrowdsourcingcontinuous integrationundefined behaviormemory safety
0
0 comments X

The pith

An open crowdsourced effort has integrated multiple verification tools into continuous integration to produce machine-checked proofs for unsafe code in the Rust standard library.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a verification campaign on a fork of the Rust standard library in which contributors add machine-checked proofs that certain unsafe operations do not produce out-of-bounds access, null or dangling pointer dereference, or use of uninitialized memory. These proofs are checked automatically whenever the library code changes. A sympathetic reader would care because Rust's type system rules out many memory errors only if the unsafe code inside the standard library itself is correct, and existing testing leaves some of those errors undetected. The work claims to be the largest such campaign reported for any software library and treats the remaining unverified parts as open problems rather than solved.

Core claim

We present what is, to the best of our knowledge, the largest verification campaign reported for a software library: an open, crowdsourced effort that integrates complementary verification tools into the continuous integration of a verification repository forked from the Rust standard library. We analyze the campaign's effectiveness, discuss the practical value of machine-checked proofs for a subset of undefined behaviors (e.g., out-of-bounds access, null and dangling pointer dereferences, and use of uninitialized memory), and frame the remaining obstacles as open challenges for the formal-methods community.

What carries the argument

Integration of complementary verification tools into the continuous integration pipeline of a forked Rust standard library repository, with open crowdsourced proof contributions.

If this is right

  • Machine-checked proofs can give stronger guarantees than testing alone for the targeted undefined behaviors inside the standard library.
  • Complementary verification tools can be combined effectively inside a single continuous-integration workflow for a large code base.
  • An open repository model allows ongoing maintenance of proofs as the library evolves.
  • The campaign surfaces concrete open challenges for tool builders rather than claiming full coverage.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar crowdsourced verification repositories could be created for other widely used libraries that contain unsafe or low-level code.
  • Success at this scale would strengthen the case for treating machine-checked absence of specific undefined behaviors as a practical quality metric for language implementations.
  • The open nature of the effort creates a living artifact that future changes to Rust can be checked against without restarting from scratch.

Load-bearing premise

The verification tools are sound with respect to Rust's operational semantics and correctly capture the targeted classes of undefined behavior.

What would settle it

Discovery of an out-of-bounds access, null or dangling pointer dereference, or use of uninitialized memory inside a module for which the campaign claims a machine-checked proof, or demonstration that one of the integrated tools accepts such an error.

Figures

Figures reproduced from arXiv: 2606.17374 by Bart Jacobs, Byron Cook, Carolyn Zech, Celina Val, Felipe R. Monteiro, Michael Tautschnig, Rahul Kumar, Ranjit Jhala, Rebecca Rumbul, Remi Delmas, Thanh Nguyen, Zyad Hassan.

Figure 1
Figure 1. Figure 1: Growth of manually written function contracts over time (core + std), broken down by function category: unsafe functions (marked unsafe fn), safe abstractions (safe functions that internally use unsafe blocks), and safe (all other functions). The dashed line shows the subset of contracts whose proofs pass verification (always ≤ the total). The plateau after October 2025 motivated the development of automat… view at source ↗
Figure 2
Figure 2. Figure 2: Number of proof harnesses produced (not necessarily verified) by Autoharness vs. manual effort, across function categories (alloc + core + std, March 2026). Of the 16,748 automatically produced harnesses, 11,970 were successfully verified ( [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗
read the original abstract

Rust's type system prevents many classes of memory errors, yet its standard library relies heavily on unsafe code whose correctness is validated through testing, including dynamic checks under Miri, but lacks static verification. We present what is, to the best of our knowledge, the largest verification campaign reported for a software library: an open, crowdsourced effort that integrates complementary verification tools into the continuous integration of a verification repository forked from the Rust standard library. We analyze the campaign's effectiveness, discuss the practical value of machine-checked proofs for a subset of undefined behaviors (e.g., out-of-bounds access, null and dangling pointer dereferences, and use of uninitialized memory), and frame the remaining obstacles as open challenges for the formal-methods community.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents a crowdsourced verification campaign for the Rust standard library: a forked repository integrates complementary verification tools into continuous integration, claimed as the largest such effort reported for a software library. It analyzes the campaign's effectiveness, discusses the practical value of machine-checked proofs for a subset of undefined behaviors (out-of-bounds access, null/dangling pointer dereferences, use of uninitialized memory), and frames remaining obstacles as open challenges.

Significance. If the soundness assumptions hold and quantitative results substantiate the claims, the work would represent a notable engineering achievement in scaling formal verification to a large, widely-used library containing unsafe code, providing concrete evidence of the feasibility of tool integration for real-world Rust verification.

major comments (2)
  1. [Abstract] Abstract: the claim of analyzing the campaign's effectiveness is not supported by any quantitative results, coverage metrics, or tool-specific soundness arguments in the text; this prevents evaluation of the central claim that the effort produces machine-checked proofs for the targeted UB classes.
  2. [Abstract / Setup] Setup description (implied in abstract and introduction): the soundness of the integrated verification tools w.r.t. Rust's operational semantics is assumed without argument, reference to a machine-checked semantics, or discussion of how CI integration preserves soundness across tool combinations; this is load-bearing for interpreting reported results as evidence that the targeted UB classes are absent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback. We address each major comment below, indicating planned revisions where appropriate.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claim of analyzing the campaign's effectiveness is not supported by any quantitative results, coverage metrics, or tool-specific soundness arguments in the text; this prevents evaluation of the central claim that the effort produces machine-checked proofs for the targeted UB classes.

    Authors: We agree that the abstract's reference to analyzing effectiveness would benefit from clearer pointers to supporting details. The manuscript body describes the campaign outcomes, including which verification tools succeeded on which library components and the specific obstacles encountered for the targeted UB classes. To strengthen the presentation, we will revise the abstract to explicitly reference the quantitative elements already present (such as the scale of the forked repository and the number of functions addressed by each tool) and add a new subsection summarizing coverage metrics drawn from the CI runs. revision: yes

  2. Referee: [Abstract / Setup] Setup description (implied in abstract and introduction): the soundness of the integrated verification tools w.r.t. Rust's operational semantics is assumed without argument, reference to a machine-checked semantics, or discussion of how CI integration preserves soundness across tool combinations; this is load-bearing for interpreting reported results as evidence that the targeted UB classes are absent.

    Authors: The manuscript relies on the established soundness claims of the individual tools as reported in their source publications rather than providing a new combined proof. No complete machine-checked operational semantics for Rust exists in the literature, which precludes referencing one. We will add an explicit subsection to the setup section that states the per-tool soundness assumptions, acknowledges the lack of a unified semantics, and explains that the CI configuration runs each tool independently on the same code base without composing their analyses, thereby preserving the individual soundness guarantees. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical engineering report

full rationale

The paper reports on a crowdsourced verification campaign integrating tools into CI for a forked Rust stdlib. It makes no mathematical derivations, predictions, or first-principles claims that could reduce to inputs by construction. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear. The central claim is the scale and practical value of the effort; soundness of tools is an external assumption, not a derived result within the paper. This matches the default case of a self-contained empirical report.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper is a report on a practical verification campaign rather than a theoretical derivation, so the ledger contains only the background assumption required for the value of the proofs.

axioms (1)
  • domain assumption The verification tools used are sound for the targeted undefined behaviors in Rust
    The usefulness of the machine-checked proofs depends on the tools correctly modeling the relevant parts of Rust semantics.

pith-pipeline@v0.9.1-grok · 5685 in / 1257 out tokens · 32639 ms · 2026-06-26T22:56:56.670622+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

39 extracted references · 24 canonical work pages

  1. [1]

    In: Pro- ceedings of the 38th International Conference on Software Engineering Companion

    Anderson, B., Bergstrom, L., Goregaokar, M., Matthews, J., McAllister, K., Mof- fitt, J., Sapin, S.: Engineering the servo web browser engine using rust. In: Pro- ceedings of the 38th International Conference on Software Engineering Companion. p. 81–89. ICSE ’16, Association for Computing Machinery, New York, NY, USA (2016).https://doi.org/10.1145/2889160.2889229

  2. [2]

    In: NASA Formal Methods: 14th International Symposium, NFM 2022, Pasadena, CA, USA, May 24–27, 2022, Proceedings

    Astrauskas, V., B´ ıl´ y, A., Fiala, J., Grannan, Z., Matheja, C., M¨ uller, P., Poli, F., Summers, A.J.: The prusti project: Formal verification for rust. In: NASA Formal Methods: 14th International Symposium, NFM 2022, Pasadena, CA, USA, May 24–27, 2022, Proceedings. p. 88–108. Springer-Verlag, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-03...

  3. [3]

    ACM Program

    Astrauskas, V., Matheja, C., Poli, F., M¨ uller, P., Summers, A.J.: How do pro- grammers use unsafe rust? Proc. ACM Program. Lang.4(OOPSLA) (Nov 2020). https://doi.org/10.1145/3428204

  4. [4]

    Ayoun, S.E., Denis, X., Maksimovi´ c, P., Gardner, P.: A hybrid approach to semi- automated rust verification. Proc. ACM Program. Lang.9(PLDI) (Jun 2025). https://doi.org/10.1145/3729289

  5. [5]

    ACM Transactions on Programming Lan- guages and Systems (2025),https://project-everest.github.io/assets/ everest-perspectives-2025.pdf, to appear

    Bhargavan, K., Bond, B., Delignat-Lavaud, A., Fournet, C., Hawblitzel, C., Hrit ¸cu, C., Ishtiaq, S., Kohlweiss, M., Leino, R., Lorch, J., Maillard, K., Parno, B., Protzenko, J., Ramananandro, T., Rastogi, A., Swamy, N., Zanella-B´ eguelin, S.: Project Everest: Perspectives from developing industrial- grade high-assurance software. ACM Transactions on Pro...

  6. [6]

    In: Verified Software

    Bhargavan, K., Buyse, M., Franceschino, L., Hansen, L.L., Kiefer, F., Schneider- Bensch, J., Spitters, B.: hax: Verifying security-critical Rust software using multiple provers. In: Verified Software. Theories, Tools and Experiments (VSTTE). pp. 96–

  7. [7]

    Springer (2025).https://doi.org/10.1007/978-3-031-86695-1_7

  8. [8]

    In: Proceedings of the 2025 ACM SIGSAC Conference on Computer and Com- munications Security

    Bhargavan, K., Hansen, L.L., Kiefer, F., Schneider-Bensch, J., Spitters, B.: Formal security and functional verification of cryptographic protocol implementations in rust. In: Proceedings of the 2025 ACM SIGSAC Conference on Computer and Com- munications Security. p. 2729–2743. CCS ’25, Association for Computing Machin- ery, New York, NY, USA (2025).https...

  9. [9]

    Blanc, A.L., Lam, P.: Lessons learned so far from a community effort to verify the Rust standard library (work-in-progress).https://arxiv.org/abs/2510.01072 (2025) 20 Cook et al

  10. [10]

    In: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation

    Boos, K., Liyanage, N., Ijaz, R., Zhong, L.: Theseus: an experiment in operating system structure and state management. In: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation. OSDI’20, USENIX Association, USA (2020)

  11. [11]

    InProceedings of the 25th ACM Symposium on Operating Systems Principles (SOSP)

    Chen, H., Ziegler, D., Chajed, T., Chlipala, A., Kaashoek, M.F., Zeldovich, N.: Using crash hoare logic for certifying the FSCQ file system. In: Proceedings of SOSP. pp. 18–37 (2015).https://doi.org/10.1145/2815400.2815402

  12. [12]

    Soft- ware: Practice and Experience51(4), 772–797 (2021).https://doi.org/https: //doi.org/10.1002/spe.2949

    Chong, N., Cook, B., Eidelman, J., Kallas, K., Khazem, K., Monteiro, F.R., Schwartz-Narbonne, D., Tasiran, S., Tautschnig, M., Tuttle, M.R.: Code-level model checking in the software development workflow at amazon web services. Soft- ware: Practice and Experience51(4), 772–797 (2021).https://doi.org/https: //doi.org/10.1002/spe.2949

  13. [13]

    In: Proceedings of the ACM/IEEE 42nd Inter- national Conference on Software Engineering: Software Engineering in Practice

    Chong, N., Cook, B., Kallas, K., Khazem, K., Monteiro, F.R., Schwartz-Narbonne, D., Tasiran, S., Tautschnig, M., Tuttle, M.R.: Code-level model checking in the software development workflow. In: Proceedings of the ACM/IEEE 42nd Inter- national Conference on Software Engineering: Software Engineering in Practice. p. 11–20. ICSE-SEIP ’20, Association for Co...

  14. [14]

    In: Chockler, H., Weissenbacher, G

    Chudnov, A., Collins, N., Cook, B., Dodds, J., Huffman, B., MacC´ arthaigh, C., Magill, S., Mertens, E., Mullen, E., Tasiran, S., Tomb, A., Westbrook, E.: Con- tinuous formal verification of amazon s2n. In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification. pp. 430–446. Springer International Publish- ing, Cham (2018)

  15. [15]

    Cook, B., Khazem, K., Kroening, D., Tasiran, S., Tautschnig, M., Tuttle, M.R.: Model checking boot code from aws data centers. Form. Methods Syst. Des.57(1), 34–52 (Jul 2021).https://doi.org/10.1007/s10703-020-00344-2

  16. [16]

    In: Proceedings of ICSE (2024).https://doi.org/10.1145/3597503.3639136

    Cui, M., Sun, S., Xu, H., Zhou, Y.: Is unsafe an achilles’ heel? A comprehensive study of safety requirements in unsafe Rust programming. In: Proceedings of ICSE (2024).https://doi.org/10.1145/3597503.3639136

  17. [17]

    Dang, H.H., Jourdan, J.H., Kaiser, J.O., Dreyer, D.: RustBelt meets relaxed mem- ory. Proc. ACM Program. Lang.4(POPL) (2019).https://doi.org/10.1145/ 3371102

  18. [18]

    In: Formal Methods and Software Engineering (ICFEM)

    Denis, X., Jourdan, J.H., March´ e, C.: Creusot: A foundry for the deductive verification of Rust programs. In: Formal Methods and Software Engineering (ICFEM). LNCS, vol. 13478, pp. 90–105 (2022).https://doi.org/10.1007/ 978-3-031-17244-1_6

  19. [19]

    In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering

    Gadelha, M.R., Monteiro, F.R., Morse, J., Cordeiro, L.C., Fischer, B., Nicole, D.A.: Esbmc 5.0: an industrial-strength c model checker. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. p. 888–891. ASE ’18, Association for Computing Machinery, New York, NY, USA (2018).https://doi.org/10.1145/3238147.3240481

  20. [20]

    G¨ aher, L., Sammler, M., Jung, R., Krebbers, R., Dreyer, D.: RefinedRust: A type system for high-assurance verification of Rust programs. Proc. ACM Program. Lang.8(PLDI) (2024).https://doi.org/10.1145/3656422

  21. [21]

    In: Proceedings of the 35th ACM SIG- PLAN Conference on Programming Language Design and Implementation (PLDI)

    Greenaway, D., Lim, J., Andronick, J., Klein, G.: Don’t sweat the small stuff: For- mal verification of C code without the pain. In: Proceedings of the 35th ACM SIG- PLAN Conference on Programming Language Design and Implementation (PLDI). pp. 429–439 (2014).https://doi.org/10.1145/2594291.2594296

  22. [22]

    In: Proceedings of OSDI

    Gu, R., Shao, Z., Chen, H., Wu, X., Kim, J., Sj¨ oberg, V., Costanzo, D.: Cer- tiKOS: An extensible architecture for building certified concurrent OS kernels. In: Proceedings of OSDI. pp. 653–669 (2016) Verifying the Rust Standard Library 21

  23. [23]

    In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI)

    Hawblitzel, C., Howell, J., Lorch, J.R., Narayan, A., Parno, B., Zhang, D., Zill, B.: Ironclad apps: End-to-end security via automated full-system verification. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI). pp. 165–181. USENIX Association (2014)

  24. [24]

    ACM Comput

    Huang, L., Ebersold, S., Kogtenkov, A., Meyer, B., Liu, Y.: Lessons from formally verified deployed software systems. ACM Comput. Surv.58(8) (Feb 2026).https: //doi.org/10.1145/3785652

  25. [25]

    Jour- nal of Object Technology25(3) (2025).https://doi.org/10.5381/jot.2025.25

    Jacobs, B., Fasse, J.: An approach for modularly verifying the core of Rust’s atomic reference counting algorithm against the (Y)C20 memory consistency model. Jour- nal of Object Technology25(3) (2025).https://doi.org/10.5381/jot.2025.25. 3.a5

  26. [26]

    In: Proceed- ings of the Third International Conference on NASA Formal Methods

    Jacobs, B., Smans, J., Philippaerts, P., Vogels, F., Penninckx, W., Piessens, F.: Verifast: a powerful, sound, predictable, fast verifier for c and java. In: Proceed- ings of the Third International Conference on NASA Formal Methods. p. 41–55. NFM’11, Springer-Verlag, Berlin, Heidelberg (2011)

  27. [27]

    Jung, R., Dang, H.H., Kang, J., Dreyer, D.: Stacked borrows: an aliasing model for rust. Proc. ACM Program. Lang.4(POPL) (Dec 2019).https://doi.org/10. 1145/3371109

  28. [28]

    Jung, R., Jourdan, J.H., Krebbers, R., Dreyer, D.: Rustbelt: securing the foun- dations of the rust programming language. Proc. ACM Program. Lang.2(POPL) (Dec 2017).https://doi.org/10.1145/3158154

  29. [29]

    Jung, R., Jourdan, J.H., Krebbers, R., Dreyer, D.: Safe systems programming in rust. Commun. ACM64(4), 144–152 (Mar 2021).https://doi.org/10.1145/ 3418295

  30. [30]

    Jung, R., Kimock, B., Poveda, C., S´ anchez Mu˜ noz, E., Scherer, O., Wang, Q.: Miri: Practical undefined behavior detection for Rust. Proc. ACM Program. Lang. 10(POPL) (2026).https://doi.org/10.1145/3776690

  31. [31]

    In: Proceedings of SOSP

    Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elka- duwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Win- wood, S.: seL4: Formal verification of an OS kernel. In: Proceedings of SOSP. pp. 207–220 (2009).https://doi.org/10.1145/1629575.1629596

  32. [32]

    Lattuada, A., Hance, T., Cho, C., Brun, M., Subasinghe, I., Zhou, Y., Howell, J., Parno, B., Hawblitzel, C.: Verus: Verifying Rust programs using linear ghost types. Proc. ACM Program. Lang.7(OOPSLA2) (2023).https://doi.org/10. 1145/3586037

  33. [33]

    Lehmann, N., Geller, A.T., Vazou, N., Jhala, R.: Flux: Liquid types for rust. Proc. ACM Program. Lang.7(PLDI) (Jun 2023).https://doi.org/10.1145/3591283

  34. [34]

    doi:10.1145/1538788.1538814 Xupeng Li, Xuheng Li, Wei Qiang, Ronghui Gu, and Jason Nieh

    Leroy, X.: Formal verification of a realistic compiler. Communications of the ACM 52(7), 107–115 (2009).https://doi.org/10.1145/1538788.1538814

  35. [35]

    In: Proceedings of the 26th Symposium on Operating Systems Principles

    Levy, A., Campbell, B., Ghena, B., Giffin, D.B., Pannuto, P., Dutta, P., Levis, P.: Multiprogramming a 64kb computer safely and efficiently. In: Proceedings of the 26th Symposium on Operating Systems Principles. p. 234–251. SOSP ’17, Association for Computing Machinery, New York, NY, USA (2017).https://doi. org/10.1145/3132747.3132786

  36. [36]

    rust-lang.org/reference/behavior-considered-undefined.html(2024)

    Rust Team: The rust reference: Behavior considered undefined.https://doc. rust-lang.org/reference/behavior-considered-undefined.html(2024)

  37. [37]

    In: Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice

    VanHattum, A., Schwartz-Narbonne, D., Chong, N., Sampson, A.: Verifying dy- namic trait objects in Rust. In: Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice. pp. 321–330. ICSE- SEIP ’22, Association for Computing Machinery (2022).https://doi.org/10. 1145/3510457.3513031 22 Cook et al

  38. [38]

    Villani, N., Hostert, J., Dreyer, D., Jung, R.: Tree borrows. Proc. ACM Program. Lang.9(PLDI) (2025).https://doi.org/10.1145/3735592

  39. [39]

    In: 2018 International Symposium on Theoretical Aspects of Software Engineering (TASE)

    Wang, F., Song, F., Zhang, M., Zhu, X., Zhang, J.: KRust: A Formal Executable Semantics of Rust . In: 2018 International Symposium on Theoretical Aspects of Software Engineering (TASE). pp. 44–51. IEEE Computer Society, Los Alamitos, CA, USA (Aug 2018).https://doi.org/10.1109/TASE.2018.00014