pith. machine review for the scientific record.

arxiv: 2604.19824 · v1 · submitted 2026-04-20 · 💻 cs.SE

Recognition: unknown

Stateful Embedded Fuzzing with Peripheral-Accurate SystemC Virtual Prototypes

Pith reviewed 2026-05-10 04:46 UTC · model grok-4.3

classification 💻 cs.SE
keywords embedded fuzzing · SystemC-TLM · virtual prototypes · AFL++ · peripheral modeling · pre-silicon testing · stateful simulation · embedded software

The pith

Stateful SystemC-TLM virtual prototypes integrated with AFL++ enable realistic embedded software fuzzing that eliminates false positives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a framework that links the AFL++ fuzzer to a full-system SystemC-TLM simulation so that generated inputs reach peripheral models directly. This allows peripherals to produce authentic responses such as interrupts and FIFO updates during execution. Existing fuzzing methods either simplify peripherals too much, creating misleading results, or demand manual setup that limits scale. The new integration aims to deliver accurate pre-silicon testing for embedded code while keeping the speed and coverage of prior tools. The reported experiments on embedded workloads show that false positives are eliminated while code coverage and execution performance remain comparable to state-of-the-art tools.

Core claim

By injecting fuzzer-generated inputs directly into peripheral models inside a stateful SystemC-TLM virtual prototype, the framework lets peripherals trigger natural side effects such as interrupts and FIFO updates. This full-system simulation approach supports fuzzing of realistic embedded software without the accuracy loss of fast user-mode simulators or the manual instrumentation burden of traditional full-system tools.
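
The injection idea can be illustrated with a minimal sketch. It is written in plain Python rather than SystemC-TLM, and every name in it (`UartModel`, `receive`, `read_data`, `inject`) is hypothetical and not taken from the paper. It assumes a UART-like peripheral with a bounded RX FIFO and a level-sensitive interrupt line, and only shows why feeding bytes into the model, rather than answering register reads with raw fuzz data, produces side effects the software can observe.

```python
from collections import deque
from typing import Callable, Deque


class UartModel:
    """Hypothetical stateful UART model: bounded RX FIFO plus an IRQ line."""

    def __init__(self, fifo_depth: int, raise_irq: Callable[[bool], None]) -> None:
        self.fifo_depth = fifo_depth
        self.rx_fifo: Deque[int] = deque()
        self.overrun = False
        self._raise_irq = raise_irq  # callback standing in for an interrupt signal

    def receive(self, byte: int) -> None:
        """Injector entry point: one byte arrives at the peripheral."""
        if len(self.rx_fifo) >= self.fifo_depth:
            self.overrun = True              # FIFO full: byte dropped, flag set
        else:
            self.rx_fifo.append(byte & 0xFF)
        self._raise_irq(True)                # data pending, so the IRQ is asserted

    def read_data(self) -> int:
        """Software-visible data register read: pops one byte from the FIFO."""
        value = self.rx_fifo.popleft() if self.rx_fifo else 0
        if not self.rx_fifo:
            self._raise_irq(False)           # FIFO drained, so the IRQ is released
        return value


def inject(uart: UartModel, fuzz_input: bytes) -> None:
    """Feed one fuzzer-generated test case into the peripheral model."""
    for byte in fuzz_input:
        uart.receive(byte)
```

In this sketch the firmware under test would only ever see bytes that the model accepted, together with the matching interrupt and overrun state, which is the property the core claim relies on.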

What carries the argument

The stateful SystemC-TLM virtual prototype, which models peripheral state transitions so that fuzzer inputs produce authentic hardware-like side effects inside the simulation.

If this is right

  • Pre-silicon testing of embedded software can proceed at larger scale with realistic peripheral interactions.
  • Fuzzing can be applied to full embedded systems without sacrificing peripheral accuracy or requiring heavy manual instrumentation.
  • False positives arising from simplified peripheral models are removed while code coverage and execution speed remain comparable to current tools.
  • The method supports testing before hardware fabrication by keeping simulation fidelity high enough to reflect real peripheral state.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If peripheral state accuracy drives the reduction in false positives, then refinements to virtual prototype timing models could further improve detection reliability.
  • The same injection pattern might transfer to other coverage-guided fuzzers or different simulation back-ends beyond SystemC.
  • This style of virtual-prototype fuzzing could support automated regression testing in embedded development pipelines by running in software-only environments.

Load-bearing premise

The SystemC-TLM virtual prototypes must accurately capture peripheral behaviors and state transitions without introducing simulation artifacts that would mask or create false issues.

What would settle it

A direct comparison of crash reports and coverage metrics produced by the framework against the same workload run on physical embedded hardware, checking whether reported issues match and false positives disappear.
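
One way to organise such a comparison is sketched below. It assumes each crash can be reduced to a stable signature string (for example a faulting address plus a handler name); that reduction, the function name `classify_crashes`, and the result keys are illustrative assumptions rather than anything described in the paper.

```python
from typing import Dict, Set


def classify_crashes(simulated: Set[str], hardware: Set[str]) -> Dict[str, Set[str]]:
    """Split crash signatures by whether the simulator and hardware agree."""
    return {
        # Reported in simulation and reproduced on the physical board.
        "confirmed": simulated & hardware,
        # Reported in simulation only: candidates for false positives.
        "false_positives": simulated - hardware,
        # Seen on hardware only: issues the simulation failed to expose.
        "missed": hardware - simulated,
    }
```

An empty `false_positives` set for the framework, alongside a non-empty one for a baseline on the same workload, would be the kind of evidence this section asks for.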

Figures

Figures reproduced from arXiv: 2604.19824 by Chiara Ghinami, Igor Pontes Tresolavy, Luis Seibt, Nils Bosbach, Rainer Leupers.

Figure 1: Simulation-based fuzz testing. AFL++ [8], a community-maintained fork of the AFL fuzzer [23], is widely used in industry and research for coverage-guided testing. Recently, significant research has focused on enabling embedded software fuzzing by integrating AFL++ with simulators. As shown in …
Figure 2: The steps of the fuzzing workflow. 3 Fuzzing Framework. In this section, we present our framework and the adaptations required in the VP so that it can efficiently work with the fuzzer. We then describe how the framework employs injector modules to send fuzz data to peripheral models. 3.1 Workflow. Because AFL++ already supports QEMU-based fuzzing, no changes to the fuzzer were needed to add a new VP. In con…
Figure 4: No specific fuzzing seeds were given, each run began with a …
Figure 5: Code coverage comparison for the various tools.
Figure 6: Comparison of the execution/second of the three …
Original abstract

The increasing complexity of embedded software has made comprehensive manual testing impractical, motivating the use of automated techniques such as fuzzing. Coverage-guided fuzzers like AFL++ have shown strong results for conventional software but remain challenging to apply effectively in embedded contexts, where peripheral behaviors play critical roles. Existing approaches either use fast user-mode simulators, sacrificing peripheral realism, or rely on full-system simulators with manual instrumentation, limiting applicability to large-scale software. In this work, we present a novel framework that integrates AFL++ with a stateful SystemC-TLM virtual prototype to enable realistic fuzzing of embedded software. Fuzzer-generated inputs are injected directly into peripheral models, allowing peripherals to trigger natural side effects such as interrupts and FIFO updates. By integrating fuzzing with full-system simulation, our framework advances the effectiveness of pre-silicon testing for embedded systems. Results on embedded workloads show that our approach eliminates false positives while maintaining comparable code coverage and execution performance as state-of-the-art tools.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a framework that integrates AFL++ with stateful SystemC-TLM virtual prototypes for fuzzing embedded software. Fuzzer inputs are injected directly into peripheral models to produce realistic side effects such as interrupts and FIFO updates, claiming to eliminate false positives from user-mode abstractions while achieving comparable code coverage and execution performance to state-of-the-art tools on embedded workloads.

Significance. If the empirical claims hold under rigorous validation, the work could meaningfully advance pre-silicon testing for embedded systems by enabling peripheral-accurate fuzzing without the typical trade-offs between simulation speed and realism.

major comments (2)
  1. Experimental Evaluation section: the central claim that the approach 'eliminates false positives' is asserted without any description of the methodology used to identify, count, or classify false positives, the specific workloads chosen, or statistical analysis of results; this leaves the primary contribution unsupported by visible evidence.
  2. Framework Integration section: the description of direct input injection into SystemC-TLM peripheral models and maintenance of state across fuzzing iterations lacks concrete details on implementation (e.g., how interrupts are triggered or how simulation state is reset between test cases), making it impossible to assess whether the claimed realism is achieved without introducing new artifacts.
minor comments (2)
  1. Abstract: the performance and coverage comparison is stated as 'comparable' but provides no quantitative metrics or baselines, which should be summarized even at a high level.
  2. The paper would benefit from a dedicated threats-to-validity subsection addressing the fidelity of the SystemC-TLM models used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight areas where the manuscript can be strengthened. We address each major comment below and will revise the paper to provide the requested details and clarifications.

Point-by-point responses
  1. Referee: Experimental Evaluation section: the central claim that the approach 'eliminates false positives' is asserted without any description of the methodology used to identify, count, or classify false positives, the specific workloads chosen, or statistical analysis of results; this leaves the primary contribution unsupported by visible evidence.

    Authors: We agree that the Experimental Evaluation section would benefit from an explicit description of the false-positive identification methodology. In the manuscript, false positives are characterized as crashes or anomalous behaviors observed under user-mode abstractions that do not occur on real hardware due to missing peripheral state; our results demonstrate zero such cases for the proposed approach across the evaluated embedded workloads while baselines exhibit them. To address the concern, we will add a dedicated subsection in the revised Experimental Evaluation that defines the classification criteria, lists the specific workloads (including benchmark names and sizes), and includes basic statistical reporting on the observed differences. This will make the supporting evidence fully transparent. revision: yes

  2. Referee: Framework Integration section: the description of direct input injection into SystemC-TLM peripheral models and maintenance of state across fuzzing iterations lacks concrete details on implementation (e.g., how interrupts are triggered or how simulation state is reset between test cases), making it impossible to assess whether the claimed realism is achieved without introducing new artifacts.

    Authors: We acknowledge that additional implementation specifics are required for reproducibility and to confirm that no new artifacts are introduced. The current description focuses on the high-level architecture; in the revision we will expand the Framework Integration section with concrete mechanisms, such as updating peripheral registers and raising TLM interrupt notifications upon input injection, and using SystemC checkpoint/restore facilities to reset simulation state between iterations while preserving only the necessary peripheral context. These additions will allow readers to evaluate the realism of the side effects. revision: yes
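
The reset-between-iterations idea in the second response can be sketched as follows. This is a plain-Python illustration under stated assumptions: `PeripheralState`, `run_case`, and `fuzz_iterations` are hypothetical names, and `copy.deepcopy` stands in for whatever checkpoint and restore mechanism the virtual prototype actually uses, which the source does not specify.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class PeripheralState:
    """Hypothetical peripheral context that must not leak between test cases."""
    rx_fifo: List[int] = field(default_factory=list)
    irq_pending: bool = False


def run_case(state: PeripheralState, fuzz_input: bytes) -> int:
    """Apply one test case to the state and return the resulting FIFO fill level."""
    for byte in fuzz_input:
        state.rx_fifo.append(byte)
        state.irq_pending = True
    return len(state.rx_fifo)


def fuzz_iterations(initial: PeripheralState, cases: Sequence[bytes]) -> List[int]:
    """Run every case from the same snapshot so iterations stay independent."""
    snapshot = copy.deepcopy(initial)        # checkpoint taken once, before fuzzing
    results = []
    for fuzz_input in cases:
        state = copy.deepcopy(snapshot)      # restore: start from the clean state
        results.append(run_case(state, fuzz_input))
    return results
```

Because each case starts from a fresh copy of the snapshot, a FIFO filled by one input cannot influence the next one, which is the artifact the referee's second major comment is concerned about.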

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes a framework for integrating AFL++ fuzzing with stateful SystemC-TLM virtual prototypes, where inputs are injected into peripheral models to produce realistic side effects like interrupts. Central claims of eliminating false positives while preserving coverage and performance rest on empirical comparisons to prior tools, not on any derivation that reduces to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or uniqueness theorems are invoked in the abstract or summary that presuppose the result; the logic follows directly from avoiding user-mode abstractions when models are accurate, with model fidelity treated as an external assumption rather than an internal tautology. The work is self-contained via experimental validation against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; the work appears to rely on standard SystemC-TLM modeling assumptions and AFL++ usage.

Reference graph

Works this paper leans on

32 extracted references · 5 canonical work pages

  1. [1]

    Fabrice Bellard. 2005. Qemu, a fast and portable dynamic translator. In Proceedings of the Annual Conference on USENIX Annual Technical Conference (ATEC ’05). USENIX Association, Anaheim, CA, 41

  2. [2]

    Marcel Böhme, Valentin J. M. Manès, and Sang Kil Cha. 2023. Boosting fuzzer efficiency: an information theoretic perspective. Commun. ACM, 66, 11, (Oct. 2023), 89–97. doi:10.1145/3611019

  3. [3]

    Bosch. [n. d.] MCAN User Manual. https://www.bosch-semiconductors.com/media/ip_modules/pdf_2/m_can/mcan_users_manual_v331.pdf

  4. [4]

    Peng Chen and Hao Chen. 2018. Angora: efficient fuzzing by principled search. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 711–725

  5. [5]

    Abraham A. Clements, Eric Gustafson, Tobias Scharnowski, Paul Grosen, David Fritz, Christopher Kruegel, Giovanni Vigna, Saurabh Bagchi, and Mathias Payer

  6. [6]

    Halucinator: firmware re-hosting through abstraction layer emulation. In Proceedings of the 29th USENIX Conference on Security Symposium (SEC’20), Article 68. USENIX Association, USA, 18 pages. ISBN: 978-1-939133-17-5

  7. [7]

    Bo Feng. 2020. P2im github page. (2020). https://github.com/RiS3-Lab/p2im

  8. [8]

    Bo Feng, Alejandro Mera, and Long Lu. 2020. P2im: scalable and hardware-independent firmware testing via automatic peripheral interface modeling. In 29th USENIX Security Symposium (USENIX Security 20), 1237–1254

  9. [9]

    Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. Afl++: combining incremental steps of fuzzing research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20)

  10. [10]

    Andrea Fioraldi, Dominik Christian Maier, Dongjia Zhang, and Davide Balzarotti

  11. [11]

    Libafl: a framework to build modular and reusable fuzzers. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 1051–1065

  12. [12]

    Patrice Godefroid, Michael Y Levin, David A Molnar, et al. 2008. Automated whitebox fuzz testing. In NDSS. Vol. 8, 151–166

  13. [13]

    Vladimir Herdt, Daniel Große, Jonas Wloka, Tim Güneysu, and Rolf Drechsler

  14. [14]

    Verification of embedded binaries using coverage-guided fuzzing with systemc-based virtual prototypes. In Proceedings of the 2020 on Great Lakes Symposium on VLSI (GLSVLSI ’20). Association for Computing Machinery, Virtual Event, China, 101–106. ISBN: 9781450379441. doi:10.1145/3386263.3406899

  15. [15]

    Doug Jacobson. 2023. Car thieves can hack into today’s computerized vehicles. (2023). https://www.scientificamerican.com/article/to-steal-todays-computerized-cars-thieves-go-high-tech

  16. [16]

    MachineWare. 2025. Machineware website. (2025). https://www.machineware.de/

  17. [17]

    MachineWare. [n. d.] VCML. https://github.com/machineware-gmbh/vcml

  18. [18]

    Sanoop Mallissery and Yu-Sung Wu. 2023. Demystify the fuzzing methods: a comprehensive survey. ACM Comput. Surv., 56, 3, Article 71, (Oct. 2023), 38 pages. doi:10.1145/3623375

  19. [19]

    Valentin JM Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2019. The art, science, and engineering of fuzzing: a survey. IEEE Transactions on Software Engineering, 47, 11, 2312–2331

  20. [20]

    Packetlabs. 2023. The dark art of uart hacking. (2023). https://www.packetlabs.net/posts/the-dark-art-of-uart-hacking

  21. [21]

    Tobias Scharnowski, Nils Bars, Moritz Schloegel, Eric Gustafson, Marius Muench, Giovanni Vigna, Christopher Kruegel, Thorsten Holz, and Ali Abbasi. 2022. Fuzzware: using precise mmio modeling for effective firmware fuzzing. In 31st USENIX Security Symposium (USENIX Security 22), 1239–1256

  22. [22]

    Nordic Semiconductor. [n. d.] Nrfx drivers. https://github.com/NordicSemiconductor/nrfx

  23. [23]

    SystemC. 2025. Systemc website. (2025). https://systemc.org/

  24. [24]

    Dr. Ken Tindell. 2023. The can injection attack. (2023). https://www.can-cia.org/fileadmin/cia/documents/publications/cnlm/june_2023/cnlm_23-2_p20_the_can_injection_attack_ken_tindel_canis_automotive_labs.pdf

  25. [25]

    Zhenkun Yang, Yuriy Viktorov, Jin Yang, Jiewen Yao, and Vincent Zimmer

  26. [26]

    Uefi firmware fuzzing with simics virtual platform. In 2020 57th ACM/IEEE Design Automation Conference (DAC), 1–6. doi:10.1109/DAC18072.2020.9218694

  27. [27]

    Michal Zalewski. 2025. American fuzzy loop website. (2025). https://lcamtuf.coredump.cx/afl/

  28. [28]

    Zephyr. 2025. Babbling zephyr example. (2025). https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/drivers/can/babbling

  29. [29]

    Zephyr. 2025. Passthrough zephyr example. (2025). https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/drivers/uart/passthrough

  30. [30]

    Zephyr Project. 2024. Zephyr RTOS. https://www.zephyrproject.org/. Accessed: 2025-11-03. (2024)

  31. [31]

    Yaowen Zheng, Ali Davanian, Heng Yin, Chengyu Song, Hongsong Zhu, and Limin Sun. 2019. Firm-afl: high-throughput greybox fuzzing of iot firmware via augmented process emulation. In Proceedings of the 28th USENIX Conference on Security Symposium (SEC’19). USENIX Association, Santa Clara, CA, USA, 1099–1114. ISBN: 9781939133069

  32. [32]

    Xiaogang Zhu, Sheng Wen, Seyit Camtepe, and Yang Xiang. 2022. Fuzzing: a survey for roadmap. ACM Comput. Surv., 54, 11s, Article 230, (Sept. 2022), 36 pages. doi:10.1145/3512345