CVA6-RT: an Open-Source Time-Predictable RV64 Processor for Mixed-Criticality Systems

Alessandro Ottaviano; Angelo Garofalo; Christopher Reinwardt; Enrico Zelioli; Luca Benini; Nils Wistoff; Robert Balas

arxiv: 2606.26177 · v1 · pith:PH3KJ3JInew · submitted 2026-06-24 · 💻 cs.AR

Enrico Zelioli, Christopher Reinwardt, Nils Wistoff, Robert Balas, Alessandro Ottaviano, Luca Benini, Angelo Garofalo

Pith reviewed 2026-06-26 01:00 UTC · model grok-4.3

classification 💻 cs.AR

keywords CVA6-RT · RISC-V processor · real-time extensions · mixed-criticality systems · interrupt latency · TLB partitioning · scratchpad cache · time predictability

The pith

CVA6-RT adds TLB partitioning and locking, scratchpad-mode L1 caches, and hardware context stacking to the CVA6 core, reaching a 12-cycle interrupt latency for real-time use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CVA6-RT as a real-time micro-architectural extension of the open-source CVA6 RISC-V processor aimed at mixed-criticality systems. It adds TLB partitioning and locking to make address-translation timing predictable, a dynamically reconfigurable mode that turns L1 cache ways into scratchpad memory for deterministic memory access, and an enhanced interrupt controller with hardware-assisted context stacking. These changes target bounded worst-case latencies and lower timing variability so that a full 64-bit core can meet real-time requirements. The central measured outcome is that interrupt latency reaches 12 cycles when the features are active, comparable to simpler Arm Cortex-M microcontrollers and 10x lower than the baseline CVA6 core.

Core claim

CVA6-RT implements the rv64gch ISA and features advanced support for real-time execution, including TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and low-latency interrupt handling via an enhanced interrupt controller combined with hardware-assisted context stacking. With real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles, comparable to that of simpler Arm Cortex-M microcontrollers, and 10x lower than the baseline CVA6 core.
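
As a rough worked reading of these figures: if "10x lower" is taken literally, a 12-cycle latency implies a baseline of about 120 cycles. The sketch below does that arithmetic and converts cycle counts to wall-clock time. The clock frequencies are illustrative assumptions, not values from the paper, and the function names are invented for this example.

```python
# Back-of-the-envelope reading of the headline figures.
# From the abstract: 12 cycles with real-time features, "10x lower" than baseline.
# ASSUMPTION: the clock frequencies used below are illustrative only; the
# source does not state an operating frequency for CVA6-RT.

RT_LATENCY_CYCLES = 12   # stated in the abstract
REDUCTION_FACTOR = 10    # "10x lower than the baseline CVA6 core"


def implied_baseline_cycles(rt_cycles: int, factor: int) -> int:
    """Baseline latency implied by a figure that is `factor` times lower."""
    return rt_cycles * factor


def cycles_to_ns(cycles: int, freq_mhz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1_000.0 / freq_mhz


if __name__ == "__main__":
    baseline = implied_baseline_cycles(RT_LATENCY_CYCLES, REDUCTION_FACTOR)
    print(f"implied baseline latency: about {baseline} cycles")
    for freq_mhz in (100.0, 500.0, 1000.0):  # hypothetical clock frequencies
        rt_ns = cycles_to_ns(RT_LATENCY_CYCLES, freq_mhz)
        base_ns = cycles_to_ns(baseline, freq_mhz)
        print(f"{freq_mhz:6.0f} MHz: RT {rt_ns:7.1f} ns, baseline {base_ns:7.1f} ns")
```

The cycle count is the quantity the paper reports; any nanosecond figure depends entirely on the assumed clock.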

What carries the argument

The set of micro-architectural extensions consisting of TLB partitioning and locking, dynamically reconfigurable scratchpad L1 caches, and an enhanced interrupt controller with hardware-assisted context stacking.
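
To make the first of these mechanisms concrete, here is a minimal behavioral sketch of a TLB with partitioning and locking. It is a toy model under stated assumptions, not the CVA6-RT design: the fully associative organization, the round-robin victim choice, the partition names, and the `PartitionedTlb` API are all invented for illustration. The only property taken from the source is that entries can be partitioned and locked so that translation timing for critical code stays predictable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TlbEntry:
    vpn: int = 0          # virtual page number
    ppn: int = 0          # physical page number
    valid: bool = False
    locked: bool = False  # locked entries are never chosen as victims


class PartitionedTlb:
    """Toy fully associative TLB split into fixed-size partitions (assumed layout)."""

    def __init__(self, partition_sizes: Dict[str, int]) -> None:
        self.partitions: Dict[str, List[TlbEntry]] = {
            name: [TlbEntry() for _ in range(size)]
            for name, size in partition_sizes.items()
        }
        # Per-partition round-robin pointer for victim selection (assumption).
        self._next: Dict[str, int] = {name: 0 for name in partition_sizes}

    def lookup(self, vpn: int) -> Optional[int]:
        """Return the PPN on a hit in any partition, or None on a miss."""
        for entries in self.partitions.values():
            for entry in entries:
                if entry.valid and entry.vpn == vpn:
                    return entry.ppn
        return None

    def insert(self, partition: str, vpn: int, ppn: int, lock: bool = False) -> None:
        """Fill an entry, evicting only unlocked entries of the same partition."""
        entries = self.partitions[partition]
        count = len(entries)
        start = self._next[partition]
        for offset in range(count):
            idx = (start + offset) % count
            if not entries[idx].locked:
                entries[idx] = TlbEntry(vpn, ppn, valid=True, locked=lock)
                self._next[partition] = (idx + 1) % count
                return
        raise RuntimeError(f"all entries in partition {partition!r} are locked")


if __name__ == "__main__":
    tlb = PartitionedTlb({"critical": 2, "best_effort": 2})
    tlb.insert("critical", vpn=0x10, ppn=0xA0, lock=True)
    # Heavy best-effort traffic cannot evict the locked critical translation.
    for vpn in range(0x100, 0x110):
        tlb.insert("best_effort", vpn, vpn + 1)
    print(hex(tlb.lookup(0x10)))
```

In this model, a locked translation always hits regardless of what other partitions do, which is the sense in which locking removes a source of timing variability.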

If this is right

Worst-case execution times for critical tasks become bounded even with complex memory systems active.
Interrupt response on a 64-bit open-source core reaches speeds typical of simpler microcontrollers.
The processor supports mode switching between high-performance and deterministic operation.
Open-source RISC-V cores gain practical use in systems that previously required proprietary real-time hardware.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The scratchpad reconfiguration might allow runtime allocation of deterministic memory regions to the most critical tasks.
Comparable extensions could be explored on other open RISC-V cores to close the predictability gap with commercial parts.
Full-system tests with actual mixed-criticality applications would be needed to confirm the reported bounds hold end-to-end.

Load-bearing premise

The added hardware features will bound worst-case execution latency and reduce timing variability in actual mixed-criticality workloads without introducing new sources of unpredictability or unacceptable overhead.

What would settle it

Running the processor with real-time features enabled on representative mixed-criticality benchmarks would settle it: an interrupt latency above 12 cycles, or timing variability higher than the baseline, would refute the claim.
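
The test above can be phrased as a small check over latency samples. The sketch below is one possible formulation, assuming jitter is defined as worst case minus best case; the paper may use a different variability metric. The sample values in the demo are synthetic placeholders, not measurements from the paper.

```python
from statistics import pstdev
from typing import Sequence

CLAIMED_BOUND_CYCLES = 12  # figure reported in the abstract


def summarize(samples: Sequence[int]) -> dict:
    """Worst case, best case, jitter (max - min) and std-dev of latency samples."""
    return {
        "worst": max(samples),
        "best": min(samples),
        "jitter": max(samples) - min(samples),
        "stdev": pstdev(samples),
    }


def claim_holds(
    rt_samples: Sequence[int],
    baseline_samples: Sequence[int],
    bound: int = CLAIMED_BOUND_CYCLES,
) -> bool:
    """True if no RT sample exceeds the bound and RT jitter is not above baseline."""
    rt = summarize(rt_samples)
    base = summarize(baseline_samples)
    return rt["worst"] <= bound and rt["jitter"] <= base["jitter"]


if __name__ == "__main__":
    # SYNTHETIC placeholder data, invented purely to exercise the check.
    rt_synthetic = [12, 12, 11, 12]
    baseline_synthetic = [118, 95, 140, 122]
    print(summarize(rt_synthetic))
    print("claim holds:", claim_holds(rt_synthetic, baseline_synthetic))
```

A single sample above the bound is enough to fail the check, which matches the worst-case framing of the claim rather than an average-case one.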

Figures

Figures reproduced from arXiv: 2606.26177 by Alessandro Ottaviano, Angelo Garofalo, Christopher Reinwardt, Enrico Zelioli, Luca Benini, Nils Wistoff, Robert Balas.

**Figure 1.** CVA6-RT block diagram with enhanced modules highlighted.

• Runtime-configurable L1 instruction and data cache resources in scratchpad mode for deterministic memory access latency;
• Deterministic low-latency interrupt handling via an enhanced RISC-V CLIC and hardware-assisted register stacking for fast context switch.

Using interrupt latency as a representative use case, we show that CVA6-RT achieves 12 c…

**Figure 2.** Average interrupt latency breakdown.

… a hybrid cache/scratchpad (SPM) mode in the L1 instruction and data caches. Each cache way can be dynamically configured either as a conventional cache way or as software-managed scratchpad memory. Ways assigned to SPM are removed from the cache replacement logic, and their tags and valid bits are cleared to prevent unintended cache hits. Address decoding logic in the…
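
The per-way behavior described in the Figure 2 excerpt can be sketched as a small behavioral model. This is an illustration under assumptions, not the CVA6-RT RTL: the `HybridCache` class, the round-robin replacement policy, and clearing tags when a way returns to cache mode are all choices made for this example. What comes from the source is that SPM ways are excluded from replacement and have their tags and valid bits cleared.

```python
from typing import List, Optional


class HybridCache:
    """Toy set-associative cache whose ways can be switched to scratchpad (SPM)."""

    def __init__(self, num_sets: int, num_ways: int) -> None:
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.is_spm: List[bool] = [False] * num_ways
        # Indexed as tags[way][set] and valid[way][set].
        self.tags = [[0] * num_sets for _ in range(num_ways)]
        self.valid = [[False] * num_sets for _ in range(num_ways)]
        self._rr = [0] * num_sets  # per-set round-robin counter (assumed policy)

    def set_way_mode(self, way: int, spm: bool) -> None:
        """Switch a way's mode and clear its tags and valid bits.

        Clearing on the way back to cache mode is an assumption of this sketch.
        """
        self.is_spm[way] = spm
        self.tags[way] = [0] * self.num_sets
        self.valid[way] = [False] * self.num_sets

    def cache_ways(self) -> List[int]:
        """Ways currently taking part in lookup and replacement."""
        return [w for w in range(self.num_ways) if not self.is_spm[w]]

    def lookup(self, set_idx: int, tag: int) -> Optional[int]:
        """Return the hitting way, or None on a miss. SPM ways never hit."""
        for way in self.cache_ways():
            if self.valid[way][set_idx] and self.tags[way][set_idx] == tag:
                return way
        return None

    def fill(self, set_idx: int, tag: int) -> int:
        """Allocate a line on a miss, choosing the victim among cache ways only."""
        ways = self.cache_ways()
        if not ways:
            raise RuntimeError("no way left in cache mode")
        victim = ways[self._rr[set_idx] % len(ways)]
        self._rr[set_idx] += 1
        self.tags[victim][set_idx] = tag
        self.valid[victim][set_idx] = True
        return victim


if __name__ == "__main__":
    cache = HybridCache(num_sets=4, num_ways=4)
    way = cache.fill(0, 0xAB)
    cache.set_way_mode(way, spm=True)   # way leaves the cache
    print(cache.lookup(0, 0xAB))        # stale line can no longer hit
```

Because an SPM way is never a replacement victim, data placed there by software is not displaced by other traffic, which is where the deterministic access latency would come from.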

Original abstract

This work presents CVA6-RT, a real-time micro-architectural extension of the CVA6 core to bound worst-case latency and reduce task's timing execution variability. CVA6-RT implements the rv64gch ISA and features advanced support for real-time execution, including TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and low-latency interrupt handling via an enhanced interrupt controller combined with hardware-assisted context stacking. With real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles, comparable to that of simpler Arm Cortex-M microcontrollers, and 10x lower than the baseline CVA6 core.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

CVA6-RT adds known real-time extensions to the open CVA6 core and claims 12-cycle interrupt latency, but the abstract supplies no measurement details or evidence.

The main takeaway is that this paper describes CVA6-RT, a variant of the existing CVA6 RISC-V core with added support for time-predictable execution. It includes TLB partitioning and locking, a reconfigurable scratchpad mode for the L1 caches, and an enhanced interrupt controller with hardware context stacking, all on the rv64gch ISA. With the features on, it reports 12-cycle interrupt latency, 10x better than the baseline.

The work applies established techniques rather than inventing new ones, but the concrete open-source implementation for this core in a mixed-criticality setting is the actual addition. Being open-source is useful for groups that need a starting point for safety-critical RISC-V designs in automotive or industrial domains.

The description of the micro-arch changes is clear enough in the abstract. However, the latency claim appears without any reference to test methodology, benchmarks, error bars, or checks for new sources of variability from the extensions themselves. That gap makes it hard to judge whether the features deliver on the promise in practice.

The paper targets hardware designers and real-time systems engineers working with open processors. A reader already familiar with CVA6 or looking for predictable RV64 options would get the most from the design details, assuming the full version includes the implementation and evaluation.

It deserves peer review because the topic matters and the core is open, even though the evaluation section will need close checking.

Referee Report

2 major / 2 minor

Summary. This paper presents CVA6-RT, a real-time micro-architectural extension of the open-source CVA6 RV64 core implementing the rv64gch ISA. It adds TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and an enhanced interrupt controller with hardware-assisted context stacking. The central quantitative claim is that, with real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles—comparable to simpler Arm Cortex-M microcontrollers and 10x lower than the baseline CVA6 core—for use in mixed-criticality systems.

Significance. If the performance claims are substantiated, the work would be significant for delivering an open-source, time-predictable RISC-V processor suitable for safety-critical and mixed-criticality applications. The specific 10x reduction in interrupt latency and the combination of features for bounding worst-case latency represent a practical contribution to the field of predictable hardware.

major comments (2)

[Abstract] The claim of a 12-cycle interrupt latency is presented without any measurement methodology, benchmarks, error analysis, or verification approach. This prevents assessment of whether the data support the central performance claim.
[Micro-architectural extensions] The descriptions of TLB partitioning/locking, dynamically reconfigurable scratchpad L1 caches, and the enhanced interrupt controller do not include analysis demonstrating that these extensions bound worst-case execution latency and reduce timing variability without introducing new sources of unpredictability or unacceptable overhead in mixed-criticality workloads.

minor comments (2)

The abstract and introduction would benefit from explicit cross-references to the sections containing the evaluation methodology and results that support the 12-cycle latency figure.
Consider clarifying the exact configuration of the baseline CVA6 used for the 10x comparison (e.g., cache sizes, pipeline details) to allow direct reproduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline the planned revisions.

Referee: [Abstract] The claim of a 12-cycle interrupt latency is presented without any measurement methodology, benchmarks, error analysis, or verification approach. This prevents assessment of whether the data support the central performance claim.

Authors: We agree that the abstract would benefit from additional context. The full manuscript details the measurement approach in the evaluation section, using cycle-accurate RTL simulation, targeted interrupt-injection benchmarks, and trace-based verification. We will revise the abstract to concisely reference the simulation environment and benchmark methodology supporting the 12-cycle figure.
revision: yes

Referee: [Micro-architectural extensions] The descriptions of TLB partitioning/locking, dynamically reconfigurable scratchpad L1 caches, and the enhanced interrupt controller do not include analysis demonstrating that these extensions bound worst-case execution latency and reduce timing variability without introducing new sources of unpredictability or unacceptable overhead in mixed-criticality workloads.

Authors: The manuscript includes design rationale and initial quantitative results on reduced variability. We acknowledge, however, that a more explicit analysis of worst-case latency bounding, potential new unpredictability sources, and overhead under mixed-criticality workloads is needed. We will add a dedicated subsection with this analysis and supporting experiments in the revised version.
revision: yes
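
For readers unfamiliar with trace-based latency measurement, the kind of post-processing the simulated rebuttal alludes to might look like the sketch below. Everything here is hypothetical: the trace format, the event names, and the definition of latency as cycles from interrupt assertion to handler entry are assumptions for illustration, since the source does not describe the actual flow. The demo trace is synthetic.

```python
from typing import Iterable, List, Optional, Tuple

# HYPOTHETICAL trace format: (cycle, event) tuples with event names invented
# for this sketch. A real RTL simulation trace would need its own parser.
IRQ_ASSERTED = "irq_asserted"
HANDLER_ENTERED = "handler_entered"


def latencies_from_trace(trace: Iterable[Tuple[int, str]]) -> List[int]:
    """Pair each interrupt assertion with the next handler entry.

    Returns the cycle delta for every serviced interrupt, in trace order.
    """
    latencies: List[int] = []
    pending: Optional[int] = None  # cycle of the interrupt awaiting service
    for cycle, event in trace:
        if event == IRQ_ASSERTED and pending is None:
            pending = cycle
        elif event == HANDLER_ENTERED and pending is not None:
            latencies.append(cycle - pending)
            pending = None
    return latencies


if __name__ == "__main__":
    # SYNTHETIC trace, invented to exercise the function.
    synthetic_trace = [
        (100, IRQ_ASSERTED),
        (112, HANDLER_ENTERED),
        (300, IRQ_ASSERTED),
        (312, HANDLER_ENTERED),
    ]
    print(latencies_from_trace(synthetic_trace))
```

Which two events bracket the measurement matters: counting to the first handler instruction versus to the end of context save can change the reported figure, so a revised manuscript would need to state the definition it uses.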

Circularity Check

0 steps flagged

No significant circularity detected

The document is a hardware design description of micro-architectural extensions to CVA6. It presents implementation features (TLB partitioning, scratchpad caches, interrupt controller) and states measured results such as 12-cycle interrupt latency. No equations, parameter fitting, predictions derived from inputs, or self-citation chains appear in the provided abstract or described content. The central claims rest on engineering implementation and benchmarking rather than any derivation that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model, free parameters, axioms, or invented entities are described; the contribution is a hardware microarchitecture extension.

Reference graph

Works this paper leans on

4 extracted references

[1] G. Gracioli et al. “A Survey on Cache Management Mechanisms for Real-Time Embedded Systems”. In: ACM Comput. Surv. 48.2 (2015).

[2] STMicroelectronics. Stellar Automotive Microcontrollers. 2023.

[3] ARM Limited. A Beginner’s Guide on Interrupt Latency of the Arm Cortex-M processors. 2016.

[4] F. Zaruba and L. Benini. “The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology”. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27.11 (2019).

RISC-V Summit Europe, Bologna, 8-12th June 2026
