
arxiv: 2606.02126 · v1 · pith:OPPHIWTVnew · submitted 2026-06-01 · 💻 cs.CR

PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing

Pith reviewed 2026-06-28 14:07 UTC · model grok-4.3

classification 💻 cs.CR
keywords binary-only fuzzing · static binary instrumentation · binary rewriting · fuzzing framework · coverage guidance · persistent mode · deferred initialization

The pith

Accurate static binary instrumentation enables effective binary-only fuzzing without dynamic overhead.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that long-standing doubts about the accuracy and soundness of static binary instrumentation for fuzzing are misplaced. It builds an extensible framework on existing static rewriting tools to support advanced fuzzing modes such as deferred initialization, persistent execution, and shared-memory communication. Evaluation across thousands of CPU-hours on standard benchmarks shows that the approach instruments the large majority of targets, delivers substantial throughput gains, and reaches coverage levels on par with source-level instrumentation. If correct, this means binary-only fuzzing can proceed at higher speed and lower cost when source code is unavailable.

Core claim

PeAR applies static binary rewriting to deliver the instrumentation required for coverage-guided fuzzing, including the implementation of deferred initialization, persistent mode, and shared-memory fuzzing. On the FUZZBENCH suite the resulting system instruments 88 percent of targets, produces a median fourfold throughput increase under persistent and shared-memory operation, and attains code coverage comparable to that obtained from compiler-based instrumentation.
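As an illustration of what "instrumentation required for coverage-guided fuzzing" typically records, here is a minimal sketch of the classic AFL-style edge-coverage map. It is an assumption for illustration only: the summary above does not say which coverage scheme PeAR emits, and the names `EdgeCoverage`, `on_block`, and `edges_hit` are hypothetical.

```python
MAP_SIZE = 1 << 16  # 64 KiB map, the size used by classic AFL (assumed here)


class EdgeCoverage:
    """Toy model of an AFL-style edge-coverage bitmap (not PeAR's actual code)."""

    def __init__(self) -> None:
        self.bitmap = bytearray(MAP_SIZE)
        self.prev_loc = 0

    def on_block(self, block_id: int) -> None:
        """What an injected probe would do at the start of each basic block."""
        cur = block_id % MAP_SIZE
        idx = cur ^ self.prev_loc
        # 8-bit hit counter that wraps on overflow, as in classic AFL.
        self.bitmap[idx] = (self.bitmap[idx] + 1) & 0xFF
        # Shifting keeps the edges A->B and B->A distinguishable.
        self.prev_loc = cur >> 1

    def edges_hit(self) -> int:
        """Number of distinct map entries touched so far."""
        return sum(1 for b in self.bitmap if b)
```

A static rewriter would insert the equivalent of `on_block` as machine code at each basic block, and the fuzzer would read the map (usually through shared memory) after every execution.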

What carries the argument

PeAR, the extensible fuzzing framework that performs complex, high-granularity instrumentation through static binary rewriting.

If this is right

  • Binary-only fuzzing no longer needs to accept the runtime cost of dynamic instrumentation for coverage guidance.
  • Advanced fuzzer features such as persistent mode can be realized statically with negligible performance penalty.
  • Closed-source targets can receive coverage feedback at a scale and speed previously associated only with source-available builds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same static-rewriting base could support other dynamic-analysis tools that currently rely on instrumentation at runtime.
  • Wider use of static methods may shift tooling choices away from dynamic frameworks in security testing pipelines.
  • Replicating the evaluation on a broader set of real-world closed-source binaries would test whether the reported success rate generalizes.

Load-bearing premise

Existing static binary rewriting tools can produce instrumentation accurate enough that any errors do not meaningfully reduce a fuzzer's ability to discover bugs.

What would settle it

On the same set of FUZZBENCH programs, a direct head-to-head run shows the static approach either instruments far fewer targets or yields measurably lower coverage and fewer crashes than a dynamic-instrumentation baseline.

Figures

Figures reproduced from arXiv: 2606.02126 by Adrian Herrera, Alvin Charles, Alwen Tiu, Peter Oslington.

Figure 1
Figure 1: PeAR overview. The target binary is disassembled and statically analyzed before being instrumented and rewritten. 3.1 Static Analysis: PeAR uses reassembleable disassembly [22] to achieve efficient static binary instrumentation. We use Ddisasm, a fast and accurate disassembler implemented using Datalog, to disassemble the target binary and translate it to the GTIRB intermediate representation for further anal…
Figure 2
Figure 2: Example of PeAR’s persistent mode instrumentation. Listing 4 shows the persistent mode handler inserted at foo’s entrypoint. The handler first saves the current register state before entering the persistent loop. The first time the loop is entered, the caller’s return address (in main) is saved to memory (Sections 3.2.3 to 3.2.3). In the loop body, the start address of the loop replaces the caller’s ret…
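The control flow described in the Figure 2 caption (save state once, then re-enter the target repeatedly in one process) can be modelled at a high level as below. This is only a conceptual sketch in Python, not PeAR's handler: the real mechanism works on registers and return addresses in machine code, and `persistent_loop`, `initial_state`, and `max_iters` are hypothetical names.

```python
import copy
from typing import Callable, Iterable


def persistent_loop(
    target: Callable[[dict, bytes], None],
    initial_state: dict,
    inputs: Iterable[bytes],
    max_iters: int = 1000,
) -> int:
    """Run `target` on many inputs inside one process; return the run count."""
    # Analogue of the handler saving the register state before the loop.
    saved = copy.deepcopy(initial_state)
    runs = 0
    for data in inputs:
        if runs >= max_iters:
            break  # a real fuzzer would restart the process at this point
        # Each iteration starts from the saved state, as if the target had
        # just been called for the first time.
        state = copy.deepcopy(saved)
        target(state, data)
        runs += 1
    return runs
```

The point of the design is that process creation and initialization are paid once rather than per test case, which is where the throughput gain of persistent mode comes from.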
Figure 3
Figure 3: Throughput of each fuzzer per benchmark, expressed as a ratio to AFL++ with compiler instrumentation. Higher is…
Abstract

Binary-only fuzzing is a key technique for finding bugs in closed-source software. Without access to source code, the fuzzer must rely on static or dynamic binary instrumentation for coverage guidance. In practice, most fuzzers favor dynamic binary instrumentation (DBI), accepting runtime overhead to avoid the perceived accuracy and soundness challenges associated with static binary instrumentation (SBI). We show that these concerns are unwarranted, and that accurate, scalable SBI is achievable using off-the-shelf frameworks. Building on these frameworks, we develop PeAR, an extensible binary-only fuzzing framework. We demonstrate PeAR's versatility by implementing several modern fuzzer features, including deferred initialization, persistent mode, and shared-memory fuzzing. We evaluate PeAR over 4.25 CPU-years of fuzzing on the FUZZBENCH benchmark and find that PeAR: (i) successfully instruments 88% of FUZZBENCH targets, comparable to the best SBI-based fuzzers; (ii) achieves a median throughput improvement of 4x when using persistent mode and shared-memory fuzzing; and (iii) attains coverage comparable to compiler-based instrumentation. Our results show that SBI is a practical and effective technique for binary-only fuzzing, and that modern binary rewriting frameworks can apply complex instrumentation with high granularity and negligible performance compromise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents PeAR, a static binary rewriting framework for binary-only fuzzing built on off-the-shelf tools. It claims that accuracy and soundness concerns with static binary instrumentation (SBI) are unwarranted, demonstrates implementation of features such as deferred initialization, persistent mode, and shared-memory fuzzing, and reports evaluation results on FUZZBENCH over 4.25 CPU-years: 88% instrumentation success rate (comparable to best SBI fuzzers), median 4x throughput improvement with persistent/shared-memory modes, and coverage comparable to compiler-based instrumentation.

Significance. If the central empirical claims hold, the work would establish SBI as a practical, lower-overhead alternative to dynamic binary instrumentation for closed-source fuzzing and show that modern static rewriting frameworks can support complex, high-granularity instrumentation reliably. The scale of the FUZZBENCH evaluation (4.25 CPU-years) provides a substantial empirical basis for the practicality argument.

major comments (2)
  1. [Abstract] Abstract and evaluation: the reported 88% instrumentation success, 4x throughput, and 'comparable coverage' rest on aggregate metrics without visible details on target selection criteria, failure modes for the 12% of targets, or statistical controls for variability; this makes it impossible to assess whether post-hoc choices or unrepresentative targets affect the claim that SBI concerns are unwarranted.
  2. [Evaluation] The central claim that static rewriting produces instrumentation whose coverage is sound and complete enough to match compiler-based results (without introducing errors that change fuzzing behavior) requires explicit validation on edge cases such as indirect control flow, PIC, exceptions, or obfuscated code; aggregate 'comparable coverage' metrics alone do not confirm equivalence or rule out systematic disassembly/relocation errors.
minor comments (2)
  1. Clarify the exact definition of 'successfully instruments' (e.g., whether it requires full relocation of all code or only coverage-relevant blocks) and how coverage is measured against compiler instrumentation.
  2. The manuscript would benefit from a short table or paragraph listing the specific off-the-shelf rewriting frameworks used and any custom extensions in PeAR.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below.

  1. Referee: [Abstract] Abstract and evaluation: the reported 88% instrumentation success, 4x throughput, and 'comparable coverage' rest on aggregate metrics without visible details on target selection criteria, failure modes for the 12% of targets, or statistical controls for variability; this makes it impossible to assess whether post-hoc choices or unrepresentative targets affect the claim that SBI concerns are unwarranted.

    Authors: FUZZBENCH is a fixed, community-defined benchmark; our evaluation used every target in the suite with no post-hoc filtering or exclusion. We will add an expanded evaluation subsection that lists per-target success rates, enumerates the concrete failure modes for the 12% (primarily relocation and exception-handling issues in a small number of targets), and details the statistical controls (median over 10 runs per configuration with inter-quartile ranges). revision: yes

  2. Referee: [Evaluation] The central claim that static rewriting produces instrumentation whose coverage is sound and complete enough to match compiler-based results (without introducing errors that change fuzzing behavior) requires explicit validation on edge cases such as indirect control flow, PIC, exceptions, or obfuscated code; aggregate 'comparable coverage' metrics alone do not confirm equivalence or rule out systematic disassembly/relocation errors.

    Authors: The FUZZBENCH programs contain substantial indirect control flow, PIC, and exception handling; coverage parity with compiler instrumentation across this diverse set already supplies evidence against systematic disassembly or relocation errors. We will add a short discussion subsection that explicitly maps these language features to the benchmark targets and notes the absence of observable behavioral divergence. A full synthetic edge-case suite would be a useful addition but exceeds the scope of the current major-revision request; we therefore treat this as a partial revision. revision: partial
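The statistical control named in the first response (median over 10 runs per configuration, with interquartile ranges) can be computed as in the following sketch. The function name `summarize_runs` is hypothetical, and the quartile convention (Python's default "exclusive" method) is an assumption; the rebuttal does not say which convention the authors use.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, Q1, Q3) for one fuzzer configuration's repeated runs."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q1, q3


def interquartile_range(values: Sequence[float]) -> float:
    """Width of the middle 50% of the runs (Q3 minus Q1)."""
    _median, q1, q3 = summarize_runs(values)
    return q3 - q1
```

Reporting the median with the interquartile range, rather than the mean alone, keeps a single unusually lucky or unlucky fuzzing run from dominating the comparison.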

Circularity Check

0 steps flagged

No circularity: empirical tool evaluation with no derivations or fitted parameters


The paper presents an engineering artifact (PeAR framework) and reports empirical results on instrumentation success rate (88%), throughput (4x median), and coverage comparability on FUZZBENCH. No equations, parameters, or self-citation chains appear in the abstract or described claims; the central assertions rest on direct measurement against external benchmarks rather than any reduction of outputs to inputs by construction. This matches the default expectation for non-circular empirical systems papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model or derivation; the central claim rests on the empirical performance of an implemented tool rather than axioms or free parameters.

pith-pipeline@v0.9.1-grok · 5773 in / 1043 out tokens · 20195 ms · 2026-06-28T14:07:00.503743+00:00 · methodology


Reference graph

Works this paper leans on

24 extracted references · 9 canonical work pages

  1. [1]

    Erick Bauman, Zhiqiang Lin, Kevin W Hamlen, et al. 2018. Superset Disassembly: Statically Rewriting x86 Binaries Without Heuristics. In Network and Distributed System Security Symposium (NDSS). The Internet Society

  2. [2]

    Fabrice Bellard. 2005. QEMU, a Fast and Portable Dynamic Translator. In Annual Technical Conference (ATC). USENIX, 41

  3. [3]

    Michael Chesser, Surya Nepal, and Damith C. Ranasinghe. 2023. Icicle: A Re-Designed Emulator for Grey-Box Firmware Fuzzing. In International Symposium on Software Testing and Analysis (ISSTA). ACM, 76–88. doi:10.1145/3597926.3598039

  4. [4]

    Sushant Dinesh, Nathan Burow, Dongyan Xu, and Mathias Payer. 2020. RetroWrite: Statically Instrumenting COTS Binaries for Fuzzing and Sanitization. In Security and Privacy (S&P). IEEE, 1497–1511. doi:10.1109/SP40000.2020.00009

  5. [5]

    Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, and Ryan Whelan. 2016. Lava: Large-scale automated vulnerability addition. In Security and Privacy (S&P). IEEE, 110–121

  6. [6]

    Gregory J Duck, Xiang Gao, and Abhik Roychoudhury. 2020. Binary Rewriting Without Control Flow Recovery. In Programming Language Design and Implementation (PLDI). ACM, 151–163. doi:10.1145/3385412.3385972

  7. [7]

    Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. AFL++: Combining incremental steps of fuzzing research. In Workshop on Offensive Technologies (WOOT). USENIX

  8. [8]

    Antonio Flores-Montoya and Eric Schulte. 2020. Datalog disassembly. In Security Symposium (SEC). USENIX, 1075–1092

  9. [9]

    Xiang Gao, Gregory J Duck, and Abhik Roychoudhury. 2021. Scalable fuzzing of program binaries with E9AFL. In Automated Software Engineering (ASE). IEEE, 1247–1251. doi:10.1109/ASE51524.2021.9678913

  10. [10]

    GrammaTech. [n. d.]. gtirb-rewriting. https://github.com/GrammaTech/gtirb-rewriting

  11. [11]

    William H. Hawkins, Jason D. Hiser, Michele Co, Anh Nguyen-Tuong, and Jack W. Davidson. 2017. Zipr: Efficient Static Binary Rewriting for Security. In Dependable Systems and Networks (DSN). 559–566. doi:10.1109/DSN.2017.27

  12. [12]

    Marc Heuse. 2021. American Fuzzy Lop + Dyninst == AFL Fuzzing blackbox binaries. https://github.com/vanhauser-thc/afl-dyninst

  13. [13]

    Jinho Jung, Stephen Tong, Hong Hu, Jungwon Lim, Yonghwi Jin, and Taesoo Kim. 2021. WINNIE: Fuzzing Windows Applications with Harness Synthesis and Fast Cloning. In Network and Distributed System Security Symposium (NDSS). The Internet Society

  14. [14]

    Hyungseok Kim, Soomin Kim, Junoh Lee, Kangkook Jee, and Sang Kil Cha

  15. [15]

    Reassembly is Hard: A Reflection on Challenges and Strategies. In USENIX Security Symposium (SEC). USENIX, 1469–1486

  16. [16]

    Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: building customized program analysis tools with dynamic instrumentation. (2005), 190–200. doi:10.1145/1065010.1065034

  17. [17]

    Jonathan Metzman, László Szekeres, Laurent Simon, Read Sprabery, and Abhishek Arya. 2021. Fuzzbench: an open fuzzer benchmarking platform and service. In Proceedings of the 29th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. 1393–1403

  18. [18]

    Stefan Nagy, Anh Nguyen-Tuong, Jason D. Hiser, Jack W. Davidson, and Matthew Hicks. 2021. Breaking Through Binaries: Compiler-quality Instrumentation for Better Binary-only Fuzzing. In Security Symposium (SEC). USENIX, 1683–1700

  19. [19]

    Paradyn Tools Project. 2018. Dyninst. https://dyninst.org/

  20. [20]

    Soumyakant Priyadarshan, Huan Nguyen, and R. Sekar. 2024. Accurate Disassembly of Complex Binaries Without Use of Compiler Metadata. In Architectural Support for Programming Languages and Operating Systems (ASPLOS). 1–18. doi:10.1145/3623278.3624766

  21. [21]

    Eric Schulte, Michael D. Brown, and Vlad Folts. 2022. A Broad Comparative Evaluation of X86-64 Binary Rewriters. In Cyber Security Experimentation and Test (CSET). ACM, 129–144. doi:10.1145/3546096.3546112

  22. [22]

    Eric Schulte, Jonathan Dorn, Antonio Flores-Montoya, Aaron Ballman, and Tom Johnson. 2020. GTIRB: intermediate representation for binaries. arXiv preprint arXiv:1907.02859 (2020)

  23. [23]

    Shuai Wang, Pei Wang, and Dinghao Wu. 2015. Reassembleable Disassembling. In 24th USENIX Security Symposium, USENIX Security 15, Washington, D.C., USA, August 12-14, 2015, Jaeyeon Jung and Thorsten Holz (Eds.). USENIX Association, 627–642. https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/wang-shuai

  24. [24]

    Zhuo Zhang, Wei You, Guanhong Tao, Yousra Aafer, Xuwei Liu, and Xiangyu Zhang. 2021. StochFuzz: Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting. In Security and Privacy (S&P). IEEE, 659–676. doi:10.1109/SP40001.2021.00109