arxiv: 2605.04933 · v1 · submitted 2026-05-06 · 💻 cs.PL

Recognition: unknown

Interaction Tree Semantics for RISC-V: Bridging Compiler and Hardware Verification

Shuanglong Kan , Sebastian Ertel

Authors on Pith no claims yet

Pith reviewed 2026-05-08 15:57 UTC · model grok-4.3

classification 💻 cs.PL

keywords RISC-Vformal semanticsInteraction TreesRocqcross-level verificationcompiler verificationhardware verificationISA

0 comments

The pith

RISC-V formal semantics using Interaction Trees in Rocq enables machine-checked proofs connecting compilers to hardware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a formal semantics for the RISC-V ISA using Interaction Trees in the Rocq proof assistant. This semantics supports verification across levels by allowing bisimulation and refinement relations between compiler IR, the ISA, and hardware designs. It covers a wide range of RISC-V extensions with machine-checked lemmas establishing the correctness of individual instruction semantics. An executable simulator is extracted from the semantics and passes all standard RISC-V test suites. Three case studies demonstrate its use for proving equivalence between LLVM IR and RISC-V code, validating safe instruction reorderings, and confirming that a hardware ALU matches the ISA contract for R-type operations.

Core claim

The central claim is that the RISC-V ISA can be modeled as Interaction Trees to provide a unified formal contract that supports cross-level verification. The model is used to prove semantic equivalence via bisimulation between LLVM IR and RISC-V implementations of array access patterns, to apply translation validation that separates safe macro-operation fusion reorderings from those that break program-counter-relative addressing, and to show that a Koika hardware ALU correctly implements all R-type integer operations against the ISA specification. The formalization includes machine-checked lemmas for instruction correctness and is validated by an extracted simulator that passes standard test

What carries the argument

Interaction Trees, which encode program behaviors with effects and nondeterminism to support bisimulation and refinement relations for relating different levels of implementation.

Load-bearing premise

The Interaction Tree model must accurately represent every observable behavior required by the RISC-V ISA specification, including all edge cases in memory ordering, exception handling, and extension interactions.

What would settle it

A concrete program or test input where the extracted simulator produces results different from the official RISC-V reference implementation, or where one of the three case-study proofs fails to establish the claimed relation.

read the original abstract

The Instruction Set Architecture (ISA) is the contract between compilers and processors; proving this contract formally demands cross-level connection to existing mechanized compilers and hardware implementations. As an open, modular ISA gaining adoption across embedded, mobile, and cloud platforms, RISC-V makes a formally verified ISA specification particularly valuable. However, existing formal RISC-V specifications focus on hardware tooling rather than cross-level verification: they provide no machine-checked instruction-level properties and lack support for verifying this contract across levels. We address these limitations with a formal semantics of the RISC-V ISA in Rocq, built on Interaction Trees (ITrees). By leveraging ITree bisimulation and refinement, our semantics enables cross-level verification from compiler IR to hardware within a single framework. Our formalization covers a wide spectrum of RISC-V extensions. The correctness of individual instruction semantics is backed by machine-checked lemmas in Rocq. We further validate it by extracting an executable simulator that passes all standard RISC-V test suites. Three case studies demonstrate the effectiveness of our semantics for cross-level verification: first, we prove semantic equivalence via bisimulation between LLVM IR and RISC-V code on an array access pattern via Vellvm (LLVM ITree semantics); second, we apply translation validation to a specific instruction reordering for macro-operation fusion, distinguishing safe reorderings from those that break program-counter-relative addressing; third, we prove that a K\^oika hardware ALU correctly implements all R-type integer operations (e.g., ADD, SUB, AND) against our ISA contract.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This Rocq ITree formalization of RISC-V gives machine-checked instruction lemmas, an extracted simulator that passes standard tests, and three narrow cross-level case studies, but its claim to capture the full ISA contract rests on limited exercised paths rather than exhaustive equivalence.

read the letter

The paper's core contribution is an Interaction Tree semantics for RISC-V inside Rocq that supports bisimulation and refinement across compiler IR, the ISA, and hardware models. They extract an executable simulator from it and show it passes the usual RISC-V test suites. Three case studies follow: bisimulation between Vellvm LLVM IR and RISC-V on array access, a check that one specific instruction reordering for macro fusion preserves or breaks PC-relative behavior, and a proof that a Kôika ALU implements the R-type integer ops against the ISA contract. Machine-checked lemmas back the individual instruction semantics, and they state coverage of a wide range of extensions. That combination of formal lemmas plus runnable validation in one framework is the concrete advance over prior RISC-V specs that stayed hardware-focused. The work is solid on what it actually checks. The lemmas and the passing simulator give direct evidence for the exercised behaviors, and the case studies are small but reproducible within the ITree setting. The stress-test concern holds up on the evidence given: the central faithfulness claim depends on test-suite agreement and three chosen examples rather than a full observational-equivalence argument that would cover untested memory-ordering cases, exception paths, or extension interactions outside those suites. The abstract does not detail how the case studies were selected or what coverage gaps remain, so referees would need to see the full development to judge how much is left open. This is for people already working with Rocq, ITrees, or verified compilers and hardware for open ISAs. A reading group focused on formal methods or RISC-V tooling would get value from the artifacts. It has enough machine-checked content and executable results to deserve peer review rather than a desk reject, even if the coverage discussion needs tightening.

Referee Report

2 major / 2 minor

Summary. The paper presents a formal semantics for the RISC-V ISA in Rocq based on Interaction Trees. It supplies machine-checked lemmas for the correctness of individual instruction semantics, extracts an executable simulator that passes standard RISC-V test suites, and demonstrates cross-level verification through three case studies: bisimulation between LLVM IR and RISC-V on an array access pattern, translation validation for instruction reordering, and equivalence between a Kôika hardware ALU and the ISA contract for R-type integer operations.

Significance. If the semantics is faithful on the exercised behaviors, the work is significant for supplying a single ITree framework that connects compiler IR, ISA, and hardware verification. The machine-checked lemmas and passing simulator provide direct, reproducible evidence for the central claim on covered paths. The case studies illustrate practical bridging across levels, which is valuable for an open, modular ISA like RISC-V.

major comments (2)

[§3] §3 (ITree semantics definition): the machine-checked lemmas establish properties for individual instructions, but the manuscript provides no explicit theorem or table enumerating which RISC-V behaviors (including exceptions, memory ordering, and extension interactions) are fully captured versus left undefined or unformalized. This is load-bearing for the claim that the semantics constitutes a faithful ISA contract usable for cross-level verification.
[§5] §5 (case studies): the three demonstrations are narrow (array access via Vellvm, one reordering example, and R-type ALU ops in Kôika). The manuscript does not discuss generalization to full programs or untested edge cases, which weakens the support for the broader 'bridging compiler and hardware verification' claim.

minor comments (2)

[Abstract] Abstract: the rendering of 'Kôika' contains a typesetting artifact ('K^oika'); correct the special character throughout.
[§2] §2 (related work): additional citations to existing RISC-V formalizations would better highlight the novelty of the ITree approach for cross-level rather than hardware-only verification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment, the recommendation for minor revision, and the constructive comments on clarifying scope and generality. We address each major comment below and will incorporate revisions to strengthen the manuscript.

read point-by-point responses

Referee: [§3] §3 (ITree semantics definition): the machine-checked lemmas establish properties for individual instructions, but the manuscript provides no explicit theorem or table enumerating which RISC-V behaviors (including exceptions, memory ordering, and extension interactions) are fully captured versus left undefined or unformalized. This is load-bearing for the claim that the semantics constitutes a faithful ISA contract usable for cross-level verification.

Authors: We agree that an explicit enumeration would improve clarity and better support the claim of a faithful ISA contract. In the revised version, we will add a table (and brief accompanying text) in §3 that enumerates the covered RISC-V extensions and instructions, the machine-checked lemmas provided for each, and explicitly identifies behaviors left undefined or partially modeled (e.g., full memory consistency models beyond sequential consistency for basic loads/stores, detailed exception semantics for certain traps, and cross-extension interactions not exercised in the current formalization). This table will be derived directly from the existing ITree definitions and the test-suite coverage already reported, without changing any lemmas or claims. revision: yes
Referee: [§5] §5 (case studies): the three demonstrations are narrow (array access via Vellvm, one reordering example, and R-type ALU ops in Kôika). The manuscript does not discuss generalization to full programs or untested edge cases, which weakens the support for the broader 'bridging compiler and hardware verification' claim.

Authors: The case studies are presented as targeted demonstrations of the ITree framework's ability to connect levels, leveraging the compositionality and bisimulation properties of ITrees. We acknowledge that the manuscript would benefit from explicit discussion of scope and generalization. In the revision, we will expand the end of §5 with a short subsection on limitations and extensibility: we will note that the modular structure (separate ITree semantics for each level) supports scaling via standard ITree combinators, that the demonstrated patterns (array access, reordering, ALU equivalence) are representative of common verification tasks, and that untested edge cases (e.g., full-program exception handling or complex memory-ordering scenarios) remain future work. This addition will be honest about current scope while reinforcing the bridging claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity; semantics and validations are independently constructed and externally checked

full rationale

The paper constructs the ITree semantics directly from the RISC-V ISA specification in Rocq. Individual instruction correctness is established by machine-checked lemmas. Validation uses an extracted simulator run against external standard RISC-V test suites plus three independent case studies (Vellvm bisimulation, reordering translation validation, Kôika ALU equivalence) that are proved within the framework. No derivation step reduces by construction to a fitted parameter, self-definition, or load-bearing self-citation chain; all central claims rest on external benchmarks and separate formal artifacts.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on the assumption that the RISC-V ISA document is the authoritative specification and that Interaction Trees can encode all required observable behaviors without loss.

axioms (1)

domain assumption The published RISC-V ISA specification is the complete and correct contract between software and hardware.
All lemmas and the simulator are constructed to match this contract.

pith-pipeline@v0.9.0 · 5578 in / 1294 out tokens · 67677 ms · 2026-05-08T15:57:55.997351+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

13 extracted references · 11 canonical work pages

[1]

WP-preserving compilation—preserving weakest preconditions for end-to-end verification

1 Carmine Abate, Mohamed Elsheikh, Kleio Liotati, František Farka, and Sebastian Ertel. WP-preserving compilation—preserving weakest preconditions for end-to-end verification. In Principles of Secure Compilation (PriSC 2026),

2026
[2]

Gray, Robert M

2 Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Alastair Reid, Kathryn E. Gray, Robert M. Norton, Prashanth Mundkur, Mark Wassell, Jon French, Christopher Pulte, Shaked Flur, Ian Stark, Neel Krishnaswami, and Peter Sewell. ISA semantics for ARMv8-A, RISC-V, and CHERI-MIPS.Proc. ACM Program. Lang., 3(POPL), 2019.doi:10.1145/3290384. 3 Alasdair Arms...

work page doi:10.1145/3290384 2019
[3]

Patterson

6 Krste Asanović and David A. Patterson. Instruction sets should be free: The case for RISC-V. Technical Report UCB/EECS-2014-146, EECS Department, University of California, Berkeley, August

2014
[4]

HITrees: Higher-order interaction trees.arXiv preprint arXiv:2510.14558,

7 Amir Mohammad Fadaei Ayyam and Michael Sammler. HITrees: Higher-order interaction trees.arXiv preprint arXiv:2510.14558,

work page arXiv
[5]

8 Thomas Bourgeat, Ian Clester, Andres Erbsen, Samuel Gruetter, Pratap Singh, Andrew Wright, and Adam Chlipala

URL:https://arxiv.org/abs/2510.14558. 8 Thomas Bourgeat, Ian Clester, Andres Erbsen, Samuel Gruetter, Pratap Singh, Andrew Wright, and Adam Chlipala. Flexible instruction-set semantics via abstract monads (experience report). Proc. ACM Program. Lang., 7(ICFP), 2023.doi:10.1145/3607833. 9 Thomas Bourgeat, Ian Clester, Andres Erbsen, Samuel Gruetter, Andrew...

work page doi:10.1145/3607833 2023
[6]

Making concurrent hardware verification sequential.Proc

10 Thomas Bourgeat, Jiazheng Liu, Adam Chlipala, and Arvind. Making concurrent hardware verification sequential.Proc. ACM Program. Lang., 9(PLDI), 2025.doi:10.1145/3729331. 11 Thomas Bourgeat, Clément Pit-Claudel, Adam Chlipala, and Arvind. The essence of bluespec: A core language for rule-based hardware design. InProceedings of the 41st ACM SIGPLAN Confe...

work page doi:10.1145/3729331 2025
[7]

Choice trees: Representing nondeterministic, recursive, and impure programs in Coq.Proc

13 Nicolas Chappe, Paul He, Ludovic Henrio, Yannick Zakowski, and Steve Zdancewic. Choice trees: Representing nondeterministic, recursive, and impure programs in Coq.Proc. ACM Program. Lang., 7(POPL), 2023.doi:10.1145/3571254. 14 Nicolas Chappe, Ludovic Henrio, and Yannick Zakowski. Monadic interpreters for concurrent memory models: Executable semantics o...

work page doi:10.1145/3571254 2023
[8]

19 Anthony C

doi:10.1007/978-3-642-32347-8_23. 19 Anthony C. J. Fox and Magnus O. Myreen. A trustworthy monadic formalization of the ARMv7 instruction set architecture. InInteractive Theorem Proving (ITP 2010), volume 6172 ofLecture Notes in Computer Science, pages 243–258. Springer, 2010.doi:10.1007/ 978-3-642-14052-5_18. 20 Dan Frumin, Amin Timany, and Lars Birkedal...

work page doi:10.1007/978-3-642-32347-8_23 2010
[9]

Kan and S

S. Kan and S. Ertel 23:19 32 George C. Necula. Translation validation for an optimizing compiler. InProceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation (PLDI 2000), pages 83–94. ACM, 2000.doi:10.1145/349299.349314. 33 David Park. Concurrency and automata on infinite sequences. InTheoretical Computer Science, 5t...

work page doi:10.1145/349299.349314 2000
[10]

Islaris: Verification of machine code against authoritative ISA semantics

37 Michael Sammler, Angus Hammond, Rodolphe Lepigre, Brian Campbell, Jean Pichon- Pharabod, Derek Dreyer, Deepak Garg, and Peter Sewell. Islaris: Verification of machine code against authoritative ISA semantics. InProceedings of the 43rd ACM SIGPLAN Interna- tional Conference on Programming Language Design and Implementation (PLDI 2022), pages 825–840. AC...

work page doi:10.1145/3519939.3523434 2022
[11]

40 Lucas Silver, Paul He, Ethan Cecchetti, Andrew K

Association for Computing Machinery.doi:10.1145/3677333.3678150. 40 Lucas Silver, Paul He, Ethan Cecchetti, Andrew K. Hirsch, and Steve Zdancewic. Semantics for noninterference with interaction trees. In37th European Conference on Object-Oriented Programming (ECOOP 2023), volume 263 ofLIPIcs, pages 29:1–29:29. Schloss Dagstuhl – Leibniz-Zentrum für Inform...

work page doi:10.1145/3677333.3678150 2023
[12]

42 Yong Kiam Tan, Magnus O

URL:https://arxiv.org/abs/2512.11577. 42 Yong Kiam Tan, Magnus O. Myreen, Ramana Kumar, Anthony Fox, Scott Owens, and Michael Norrish. A new verified compiler backend for CakeML. InProceedings of the 21st ACM SIGPLAN International Conference on Functional Programming (ICFP 2016), pages 60–73. ACM, 2016.doi:10.1145/2951913.2951924. 43 Yong Kiam Tan, Magnus...

work page doi:10.1145/2951913.2951924 2016
[13]

Pierce, and Steve Zdancewic

Technical Report UCB/EECS-2016-1. CVIT 2016 23:20 Interaction Tree Semantics for RISC-V 49 Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C. Pierce, and Steve Zdancewic. Interaction trees: Representing recursive and impure programs in Coq. Proc. ACM Program. Lang., 4(POPL), 2020.doi:10.1145/3371119. 50 Yannick Zakowski, Ca...

work page doi:10.1145/3371119 2016