arxiv: 2605.08419 · v2 · submitted 2026-05-08 · 💻 cs.CR · cs.PL


Deterministic Fully-Static Whole-Binary Translation without Heuristics

Hongyu Chen, James McGowan, Michael Franz


Pith reviewed 2026-05-15 05:39 UTC · model grok-4.3


keywords: binary translation, static translation, x86-64 to AArch64, code discovery, deterministic translation, whole-program translation, tile-based generation, no heuristics


The pith

Elevator translates entire x86-64 executables to AArch64 by enumerating every possible byte interpretation upfront and pruning only invalid paths.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Elevator as a binary translator that converts complete x86-64 programs into AArch64 without debug information, source code, or any assumptions about code layout. It generates separate translations for every feasible way each byte can be read as data, an opcode, or an operand, then removes only those paths that would cause abnormal termination. The translations are assembled from code tiles drawn from a high-level description of the source instruction set architecture, producing deterministic and self-contained output binaries. No runtime component remains in the trusted code base, so the resulting programs can be tested, validated, certified, and signed before deployment. Evaluation on the full SPECint 2006 suite shows performance comparable to or better than QEMU user-mode emulation.

Core claim

Elevator achieves deterministic fully-static whole-binary translation from x86-64 to AArch64 by considering all possible interpretations of every byte in the input executable and producing a separate translation for each feasible one ahead of time. Any byte may be treated as data, an opcode, or an opcode argument, with distinct control-flow paths generated for each viable reading and only paths leading to abnormal termination discarded. Translations are built by composing automatically derived code tiles from a high-level source ISA description, yielding complete, self-contained AArch64 binaries that contain the exact code that will run.
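
To make the enumeration concrete, here is a minimal sketch in Python over a made-up toy instruction set. It is not x86-64 and not Elevator's decoder; the names TOY_ISA, decode_at, and superset_decode, and the opcode values, are all hypothetical. The sketch attempts a decode at every byte offset, so the same byte can show up as the operand of one instruction and as the opcode of another.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy ISA (NOT x86-64): opcode byte -> (mnemonic, total length in bytes).
TOY_ISA: Dict[int, Tuple[str, int]] = {
    0x10: ("nop", 1),
    0x20: ("ret", 1),
    0x30: ("add_imm8", 2),   # opcode followed by a one-byte immediate
    0x40: ("jmp_rel8", 2),   # opcode followed by a one-byte displacement
}


def decode_at(code: bytes, off: int) -> Optional[Tuple[str, int]]:
    """Decode one instruction starting at `off`, or return None if none fits."""
    entry = TOY_ISA.get(code[off])
    if entry is None:
        return None                      # byte is not a known opcode
    mnemonic, length = entry
    if off + length > len(code):
        return None                      # operand would run past the section end
    return mnemonic, length


def superset_decode(code: bytes) -> Dict[int, Tuple[str, int]]:
    """Treat every byte offset as a potential instruction start."""
    decoded = {}
    for off in range(len(code)):
        result = decode_at(code, off)
        if result is not None:
            decoded[off] = result
    return decoded


# Byte 0x10 is both the immediate of the add at offset 0 and a nop at offset 1.
table = superset_decode(bytes([0x30, 0x10, 0x20]))
for off, (mnemonic, length) in sorted(table.items()):
    print(f"{off:#04x}: {mnemonic} ({length} bytes)")
```

A real x86-64 decoder is far more involved (prefixes, ModRM, variable-length immediates), but the outer loop over every offset is the part the core claim depends on.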

What carries the argument

Exhaustive enumeration of byte interpretations as data, opcode, or operand, combined with ahead-of-time pruning of paths that lead to abnormal termination and tile-based code generation from a high-level ISA description.
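
The pruning step can be pictured as a fixed-point computation. The sketch below is one plausible reading, not the paper's stated algorithm: it assumes an offset is discarded when every static successor is itself invalid or already discarded, so execution from that offset can only end in abnormal termination. The function name prune_doomed and the successor-map input format are hypothetical.

```python
from typing import Dict, List, Set


def prune_doomed(succs: Dict[int, List[int]]) -> Set[int]:
    """Return the offsets worth translating.

    `succs` maps each offset that decodes to a valid instruction to its
    static successor offsets. An offset missing from `succs` does not
    decode at all. An empty successor list models a normal exit such as
    a return, which is always kept.
    """
    live = set(succs)
    changed = True
    while changed:
        changed = False
        for off in sorted(live):
            targets = succs[off]
            # Doomed: it has successors, and every one of them is dead.
            if targets and all(t not in live for t in targets):
                live.remove(off)
                changed = True
    return live


# Offsets 5 and 7 do not decode. Offset 4 falls through into 5, so it is
# pruned, and 8 (which jumps to 4) is pruned with it. Offset 6 is kept
# because one of its two successors is still viable.
example = {0: [2], 2: [3], 3: [], 4: [5], 6: [7, 0], 8: [4]}
print(sorted(prune_doomed(example)))
```

Under this reading, a conditional branch survives as long as one arm is viable, which is exactly where the legal-but-unintended sequences discussed in the referee report would also survive.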

If this is right

The translated binary contains the precise code that executes, enabling offline testing, validation, certification, and cryptographic signing.
No heuristics are required to distinguish code from data, removing a common source of translation errors.
The output requires no runtime component, shrinking the trusted computing base compared with emulators or JIT compilers.
Performance reaches parity with or exceeds QEMU user-mode emulation on the SPECint 2006 suite.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same exhaustive-interpretation approach could be applied to other architecture pairs to remove heuristic dependencies in cross-ISA migration.
Code-size expansion may be reduced by sharing common tile sequences across multiple interpretation paths in future implementations.
This technique could support high-assurance certification environments where dynamic code generation is disallowed.
Similar enumeration could strengthen static binary analysis tools by systematically exploring ambiguous regions.

Load-bearing premise

That exhaustive enumeration of byte interpretations followed by pruning of abnormal-termination paths will always capture every valid execution behavior in real binaries without omission or new errors.

What would settle it

Take a binary containing a valid but data-hidden code path that only executes under a specific ambiguous byte interpretation, translate it with Elevator, and check whether the output binary executes that path and produces identical results to the original on the same inputs.
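
A differential harness for this experiment might look like the following sketch. It assumes both the original and the translated program can be launched as commands on the test machine (the original may need an x86-64 host or an emulator), and it compares only exit code and standard output. The names run and first_divergence are hypothetical, and the demo uses small Python one-liners as stand-ins for the two binaries.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple


def run(cmd: Sequence[str], stdin_data: bytes) -> Tuple[int, bytes]:
    """Run a command with the given stdin; return (exit code, stdout)."""
    proc = subprocess.run(cmd, input=stdin_data, capture_output=True, timeout=60)
    return proc.returncode, proc.stdout


def first_divergence(
    original_cmd: Sequence[str],
    translated_cmd: Sequence[str],
    inputs: List[bytes],
) -> Optional[int]:
    """Return the index of the first input on which the two programs differ."""
    for index, data in enumerate(inputs):
        if run(original_cmd, data) != run(translated_cmd, data):
            return index
    return None


# Stand-ins for the two binaries: tiny Python programs reading stdin.
echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]

print(first_divergence(echo, echo, [b"abc", b"123"]))
print(first_divergence(echo, upper, [b"123", b"abc"]))
```

Matching output on a test corpus would only show agreement on those inputs; it would not by itself establish that no spurious interpretation was retained.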

Figures

Figures reproduced from arXiv: 2605.08419 by Hongyu Chen, James McGowan, Michael Franz.

**Figure 1.** Elevator system overview. For each input x86-64 binary, Elevator consults a reusable tile bank, built once offline from hand-written C tiles compiled through LLVM with a custom calling convention, and emits a stand-alone AArch64 executable.
Section 4, Translating the CFG: Elevator separates translation into three stages. An offline stage (Section 4.1) expresses x64 instruction semantics as C functions, specializes t…
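
As a rough illustration of the tile-bank idea, the sketch below stitches per-instruction templates together at each source offset. It is a simplification under stated assumptions: the caption describes tiles as compiled C functions, whereas this sketch uses textual AArch64 assembly templates. The TILE_BANK contents, the mapping of the toy accumulator to x0, the label scheme, and the emit function are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical tile bank: source mnemonic -> AArch64 assembly template with holes.
TILE_BANK: Dict[str, str] = {
    "nop": "nop",
    "add_imm8": "add x0, x0, #{imm}",   # assumes the source accumulator lives in x0
    "jmp": "b L_{target:04x}",          # branch to the label of a source offset
    "ret": "ret",
}


def emit(instrs: List[Tuple[int, str, Dict[str, int]]]) -> str:
    """Compose tiles: one label per source offset, then the patched template."""
    lines = []
    for off, mnemonic, operands in instrs:
        lines.append(f"L_{off:04x}:")
        lines.append("    " + TILE_BANK[mnemonic].format(**operands))
    return "\n".join(lines)


program = [
    (0, "add_imm8", {"imm": 5}),
    (2, "jmp", {"target": 4}),
    (4, "ret", {}),
]
print(emit(program))
```

Labelling every source offset is what lets a branch land on any decoded interpretation, at the cost of emitting a tile for each one.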

**Figure 2.** Indirect branch handling and ABI translation at exit boundaries.

**Figure 3.** Elevator’s translation time (a) tracks input .text size (b) at Pearson 𝑟 = 0.9993; only 403.gcc and 483.xalancbmk swap between the two orderings. The annotations above each bar report the full ELF binary size, which can be substantially larger than .text when a benchmark bundles static data. 445.gobmk, for instance, ships with a ∼4.5 MB pattern database. Elevator disassembles and retranslates only .text, s…

**Figure 4.** (a) Wall-clock runtime and (b) executed instructions on SPECint 2006 at …

**Figure 5.** Microarchitectural behavior of Elevator against native AArch64 on SPECint 2006 at -O2 and -O3: (a) cycles per instruction (CPI); (b) branch-miss rate. Log y-axes absorb the 429.mcf (CPI) and 473.astar (branch-miss) outliers. QEMU and Box64 are omitted because their per-instruction rates are diluted by translator-internal instructions and are not directly comparable on a per-instruction basis.

**Figure 6.** Code-size cost of Elevator’s superset translation on SPECint 2006 at -O2 and -O3. (a) Translated .text expansion relative to natively-compiled AArch64. (b) Average x86-64 instruction length measured on each source binary. Elevator emits an AArch64 sequence at every valid source-byte offset of the x86-64 .text, and no post-translation size reduction is applied. This follows from the assumption-free stance t…
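
One hypothetical way to read the link between panels (a) and (b) is a back-of-envelope count: a conventional translator emits one sequence per real instruction, while a superset translator emits one per valid byte offset. The sketch below computes that ratio. The function name, the valid_fraction parameter, and the example numbers are illustrative assumptions, not figures from the paper, and the estimate ignores differences in per-tile size.

```python
def superset_ratio(text_bytes: int, avg_insn_len: float, valid_fraction: float) -> float:
    """Translated decode points per real source instruction.

    text_bytes:     size of the source .text section in bytes
    avg_insn_len:   average length of a real source instruction in bytes
    valid_fraction: assumed share of byte offsets that survive decoding and pruning
    """
    real_instructions = text_bytes / avg_insn_len
    superset_points = text_bytes * valid_fraction
    # Algebraically this is just avg_insn_len * valid_fraction.
    return superset_points / real_instructions


# Illustrative numbers only: 1000 bytes of code, 4-byte average instructions,
# 90% of offsets retained.
print(round(superset_ratio(1000, 4.0, 0.9), 2))
```

On this reading, binaries with longer average instructions would tend to show larger expansion, since more offsets fall inside real instructions.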

Original abstract

We present Elevator, the first binary translator that statically translates entire x86-64 executables to AArch64 without debug information, source code, or assumptions about code layout. Unlike existing systems, which rely on heuristics or runtime fallbacks to handle code-versus-data decoding errors, Elevator considers all possible interpretations of every byte and produces a separate translation for each feasible one ahead of time. Any byte may be interpreted as data, an opcode, or an opcode argument; we generate separate control flow paths for all interpretations, pruning only those leading to abnormal termination. Translations are built by composing code "tiles" automatically derived from a high-level description of the source ISA, yielding a nimble translation framework. The approach is deterministic and produces complete, self-contained binaries with no runtime component in the trusted code base. The principal cost is substantial code size expansion. The key benefit is that the output is the actual code that will run, enabling testing, validation, certification, and cryptographic signing prior to deployment, reducing risk compared to emulators or JIT compilers. We evaluate Elevator on a diverse corpus of real-world binaries, including the entire SPECint 2006 suite, demonstrating that static full-program binary translation can be both reliable and practical. Elevator achieves performance on par with or better than QEMU's user-mode JIT emulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Elevator's exhaustive byte-interpretation method is a genuine technical step forward, but its correctness still hinges on an unproven pruning assumption.


The main point for you is that Elevator claims the first fully static, heuristic-free translation of complete x86-64 binaries to AArch64 by enumerating every possible decoding of each byte, building separate control-flow paths via automatically derived ISA tiles, and discarding only those paths that terminate abnormally. The output is a single self-contained binary with no runtime component in the trusted base. That construction is new relative to the heuristic or hybrid systems cited in the abstract, and the tile-composition approach looks like a clean way to keep the framework maintainable.

The SPECint results showing performance on par with QEMU user-mode are the concrete evidence that the overhead is manageable in practice, even with the expected code-size expansion. Credit where it is due: the paper ships a working system on real binaries without debug info or layout assumptions, which is a non-trivial engineering result.

The soft spot is exactly the one the stress-test flagged. Retaining only non-crashing interpretations does not guarantee that the kept paths are the ones the original program intended; x86-64 has plenty of legal but unintended instruction sequences that execute without immediate fault. Nothing in the abstract or described method supplies a machine-checked semantics or a reachability argument that would close this gap, so the determinism is procedural rather than semantic. The evaluation section is referenced but not detailed here, so we cannot yet judge how they measured end-to-end correctness beyond crash avoidance.

This is the kind of paper that belongs in a systems-security or binary-analysis reading group. A reader working on certifiable or auditable code pipelines will get immediate value from the construction even if they end up disagreeing with the final correctness claim. It deserves a serious referee: the core idea is fresh enough and the implementation claims are concrete enough that a full review can sort out the evaluation gaps and the pruning soundness question without wasting anyone's time.

Referee Report

2 major / 1 minor

Summary. The manuscript presents Elevator, the first static whole-program binary translator from x86-64 to AArch64 that operates without debug information, source code, or assumptions about code layout. It exhaustively enumerates all possible interpretations of every byte (as data, opcode, or operand), composes separate control-flow paths using automatically derived code tiles from a high-level ISA description, and retains only those paths that do not lead to abnormal termination. The resulting translations are deterministic, self-contained, and free of runtime components in the trusted computing base. Evaluation on the full SPECint 2006 suite reports performance on par with or better than QEMU user-mode emulation, with the principal drawback being code-size expansion.

Significance. If the central correctness claim holds, the work would be a notable advance for binary translation and software assurance: it supplies a fully static, deterministic pipeline that permits pre-deployment testing, validation, certification, and cryptographic signing of the output binary. The tile-based construction from a high-level ISA description is a strength for maintainability and extensibility. The SPECint evaluation provides initial evidence of practicality, though the absence of detailed size metrics and multi-interpretation analysis limits the immediate assessment of deployability.

major comments (2)

[Abstract] Abstract and the pruning description: the claim of deterministic correctness without heuristics rests on the assumption that discarding only paths leading to abnormal termination eliminates all incorrect decodings. x86-64 admits many legal but unintended instruction sequences that execute without immediate fault (e.g., data bytes decoded as arithmetic or control-flow operations); nothing in the construction supplies a machine-checked semantics or reachability proof showing that retained paths coincide with the original program's intended interpretation. This assumption is load-bearing for the title claim and the assertion that the output is 'the actual code that will run.'
[Evaluation] Evaluation section: the SPECint 2006 results are reported without error bars, quantitative code-size blowup figures, or analysis of binaries that retain multiple feasible interpretations after pruning. Without these data it is impossible to verify that the approach scales to real-world executables without introducing subtle semantic errors or prohibitive size overhead, undermining the practicality claim relative to QEMU.

minor comments (1)

The abstract states that translations are 'built by composing code tiles automatically derived from a high-level description of the source ISA' but provides no concrete example of a tile or the derivation process; a small illustrative figure or pseudocode would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript describing Elevator. We address each of the major comments in detail below and outline the revisions we plan to make.


Referee: [Abstract] Abstract and the pruning description: the claim of deterministic correctness without heuristics rests on the assumption that discarding only paths leading to abnormal termination eliminates all incorrect decodings. x86-64 admits many legal but unintended instruction sequences that execute without immediate fault (e.g., data bytes decoded as arithmetic or control-flow operations); nothing in the construction supplies a machine-checked semantics or reachability proof showing that retained paths coincide with the original program's intended interpretation. This assumption is load-bearing for the title claim and the assertion that the output is 'the actual code that will run.'

Authors: We appreciate the referee pointing out the subtlety in our correctness argument. Elevator's design is heuristic-free by exhaustively enumerating all byte interpretations and retaining only those paths that do not lead to abnormal termination in our interpreter. This guarantees that the translated binary will not crash for any retained interpretation, and since the original program executes without abnormal termination, its paths are preserved. However, we agree that without a formal machine-checked semantics, we cannot prove that no spurious non-crashing paths are included. The 'actual code that will run' claim is intended to mean that the output is a complete, static binary containing all feasible executions, enabling pre-deployment analysis. We will revise the abstract, introduction, and conclusion to clarify this point and adjust the title claim if necessary to reflect that determinism applies to the pruned set of interpretations.
revision: partial

Referee: [Evaluation] Evaluation section: the SPECint 2006 results are reported without error bars, quantitative code-size blowup figures, or analysis of binaries that retain multiple feasible interpretations after pruning. Without these data it is impossible to verify that the approach scales to real-world executables without introducing subtle semantic errors or prohibitive size overhead, undermining the practicality claim relative to QEMU.

Authors: We agree with the referee that the evaluation would benefit from additional quantitative details. In the revised version, we will augment the Evaluation section with: (1) error bars on performance measurements derived from multiple runs on the SPECint 2006 suite, (2) explicit quantitative figures for code-size expansion (e.g., average and per-benchmark blowup ratios), and (3) an analysis of how many binaries retain multiple interpretations post-pruning, including the associated size and performance overheads. These additions will provide a clearer picture of scalability and allow direct comparison to QEMU.
revision: yes
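
For item (1), error bars from repeated runs could be derived along the lines of the sketch below. It assumes the bars are the standard error of the mean over independent runs, which is one common choice and not something the response specifies. The summarize function and the timing values are hypothetical.

```python
import statistics
from typing import Dict, List


def summarize(runs: List[float]) -> Dict[str, float]:
    """Mean, sample standard deviation, and standard error of repeated timings."""
    mean = statistics.mean(runs)
    stdev = statistics.stdev(runs)            # sample standard deviation (n - 1)
    sem = stdev / len(runs) ** 0.5            # standard error of the mean
    return {"mean": mean, "stdev": stdev, "sem": sem}


# Hypothetical wall-clock times in seconds for five runs of one benchmark.
timings = [10.0, 12.0, 11.0, 13.0, 9.0]
stats = summarize(timings)
print(f"{stats['mean']:.2f} s +/- {stats['sem']:.2f} s (n={len(timings)})")
```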

Circularity Check

0 steps flagged

No circularity: new construction via exhaustive enumeration and tile-based translation


The paper presents Elevator as a direct construction: it enumerates all byte interpretations, composes translations from automatically derived ISA tiles, and prunes paths solely by the observable criterion of abnormal termination. No parameter is fitted to a data subset and then relabeled as a prediction; no uniqueness theorem or ansatz is imported via self-citation; the central claim (complete static coverage without heuristics) is not defined in terms of its own output. The derivation therefore remains self-contained and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The approach rests on the assumption that all valid interpretations can be enumerated and pruned without loss of correctness, plus the existence of a complete high-level ISA description from which tiles can be derived. No free parameters are introduced in the abstract; the only new construct it introduces is the code tile.

axioms (2)

domain assumption: A complete high-level description of the x86-64 ISA exists from which code tiles can be automatically derived.
Invoked when describing the tile composition mechanism.
domain assumption: Pruning paths that lead to abnormal termination preserves all correct executions.
Central to the claim that the method produces complete translations.

invented entities (1)

code tiles (no independent evidence)
purpose: Reusable translation fragments automatically derived from the ISA description.
New construct introduced to build the translation framework.

