arxiv: 2606.28870 · v1 · pith:SIN7APFRnew · submitted 2026-06-27 · 💻 cs.CR · cs.SE

Understanding Binary Code Similarity for Real-World Vulnerability Detection: A Large-Scale Empirical Study

Pith reviewed 2026-06-30 09:50 UTC · model grok-4.3

classification 💻 cs.CR cs.SE
keywords binary code similarity detection · vulnerability detection · firmware analysis · third-party libraries · IoT security · empirical study · mean reciprocal rank

The pith

Build-aware queries from real binaries raise BCSD mean reciprocal rank from 0.818 to 0.981 for firmware vulnerability detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a large-scale study of binary code similarity detection (BCSD) across 60,000 firmware images from 200 vendors to assess its effectiveness for identifying real-world vulnerabilities. It evaluates the impact of four factors (vulnerable function versions, search space, function sizes, and compilation toolchains) and shows that mismatches with actual build conditions degrade performance. To address these issues, the authors introduce a build-aware query strategy that selects queries from representative real-world binaries, and they demonstrate a two-stage search that uses knowledge of third-party libraries (TPLs) to narrow the search space. These changes produce measurable gains in ranking accuracy without requiring new detection models.

Core claim

Analysis of BCSD across diverse real firmware reveals that compilation toolchains and search space cause large performance variations. Deriving queries from representative real-world binaries closes the gap and raises mean reciprocal rank (MRR) from 0.818 to 0.981. Separately, a TPL-aware two-stage search improves MRR by 18.5 percent by restricting the search space; because MRR cannot exceed 1.0, this gain cannot stack on top of the 0.981 result and must be measured against a lower baseline.
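For readers unfamiliar with the metric, mean reciprocal rank averages 1/rank of the first correct match over all queries, so it is bounded above by 1.0. The following is a minimal Python sketch, assuming the 1-based rank of each query's true match is already known; the function name and the toy ranks are illustrative and are not taken from the paper.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    ranks[i] is the 1-based position of the first correct match for
    query i, or None when the correct match was not retrieved at all
    (such a query contributes 0 to the sum).
    """
    if not ranks:
        raise ValueError("need at least one query")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue
        if rank < 1:
            raise ValueError("ranks are 1-based")
        total += 1.0 / rank
    return total / len(ranks)


# Toy example (made-up ranks): (1 + 1/2 + 0 + 1/4) / 4 = 0.4375
toy_mrr = mean_reciprocal_rank([1, 2, None, 4])
```

On this scale, the reported move from 0.818 to 0.981 is an absolute gain of 0.163, or roughly 19.9 percent relative to the baseline.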

What carries the argument

The build-aware query strategy, which selects query functions from binaries compiled under conditions matching the target firmware rather than from synthetic or mismatched sources.
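One way to picture the selection step is as a metadata match between candidate query binaries and the target firmware. The sketch below is only an illustration under stated assumptions: the `BuildProfile` fields, the overlap score, and all sample values are hypothetical, and the paper may select representative binaries by a different procedure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BuildProfile:
    # Hypothetical build attributes; the paper's actual criteria may differ.
    arch: str
    toolchain: str
    opt_level: str
    vendor: str


@dataclass(frozen=True)
class QueryCandidate:
    label: str
    build: BuildProfile


def build_overlap(a: BuildProfile, b: BuildProfile) -> int:
    """Count how many build attributes two profiles share."""
    return sum((
        a.arch == b.arch,
        a.toolchain == b.toolchain,
        a.opt_level == b.opt_level,
        a.vendor == b.vendor,
    ))


def select_query(candidates: Sequence[QueryCandidate],
                 target: BuildProfile) -> QueryCandidate:
    """Pick the candidate whose build profile best matches the target.

    Ties are resolved in favour of the earliest candidate.
    """
    if not candidates:
        raise ValueError("no query candidates supplied")
    return max(candidates, key=lambda c: build_overlap(c.build, target))


# Made-up example: a stock reference build versus a binary taken from the field.
target = BuildProfile("mips", "vendor-gcc", "Os", "acme")
candidates = [
    QueryCandidate("reference build", BuildProfile("mips", "stock-gcc", "O2", "none")),
    QueryCandidate("field binary", BuildProfile("mips", "vendor-gcc", "Os", "acme")),
]
chosen = select_query(candidates, target)
```

In this toy setup the field binary is chosen because it shares all four attributes with the target, while the reference build shares only the architecture.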

If this is right

  • Standard BCSD benchmarks that rely on non-representative queries systematically underestimate field performance.
  • Incorporating knowledge of third-party libraries to limit search space yields consistent accuracy gains across different detection methods.
  • Matching query and target binaries on compilation toolchain and build settings is required for reliable vulnerability ranking.
  • Function size and version differences alone do not explain most observed performance drops once build awareness is added.
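The search-space restriction in the second bullet can be sketched as a filter followed by a ranking. The code below is a hedged illustration, not the authors' implementation: it assumes each corpus function already carries a library label (how that label is obtained is out of scope here) and an embedding from some BCSD model, and it uses cosine similarity as a stand-in scoring function. All names and vectors are invented.

```python
import math
from typing import List, NamedTuple, Sequence


class FunctionEntry(NamedTuple):
    func_id: str
    tpl: str                      # assumed, pre-computed library label
    embedding: Sequence[float]    # assumed output of some BCSD model


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def two_stage_search(query_embedding: Sequence[float], query_tpl: str,
                     corpus: Sequence[FunctionEntry], top_k: int = 5) -> List[str]:
    # Stage 1: keep only functions from the same third-party library.
    in_library = [e for e in corpus if e.tpl == query_tpl]
    # Stage 2: rank the survivors by similarity to the query.
    ranked = sorted(in_library,
                    key=lambda e: cosine(query_embedding, e.embedding),
                    reverse=True)
    return [e.func_id for e in ranked[:top_k]]


corpus = [
    FunctionEntry("busybox:lookalike", "busybox", [1.0, 0.0]),
    FunctionEntry("openssl:ASN1_verify", "openssl", [0.9, 0.1]),
    FunctionEntry("openssl:other_fn", "openssl", [0.0, 1.0]),
]
hits = two_stage_search([1.0, 0.0], "openssl", corpus)
```

Here the look-alike function from another library would have ranked first in a flat search, but stage 1 removes it before ranking, which is the intuition behind the reported gain from a smaller search space.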

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same query-selection principle could be tested on other binary analysis tasks such as malware classification or patch identification.
  • Detection pipelines might benefit from explicitly encoding build metadata as an auxiliary input rather than treating it as noise.
  • Future large-scale studies could isolate the contribution of each factor by holding the others fixed in controlled subsets of the firmware corpus.
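The controlled-subset idea in the last bullet amounts to a stratified comparison: group results by the factors held fixed, then compare MRR across levels of the one factor allowed to vary. A minimal sketch follows, assuming per-query records stored as dictionaries with a 1-based `rank` field; the factor names and values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Mapping, Sequence, Tuple


def mrr_by_factor(records: Sequence[Mapping[str, object]], vary: str,
                  hold_fixed: Sequence[str]) -> Dict[Tuple, Dict[Hashable, float]]:
    """MRR per level of `vary`, computed inside each stratum of `hold_fixed`.

    Every record must contain the factor keys and a 1-based "rank".
    """
    buckets: Dict[Tuple, Dict[Hashable, List[float]]] = defaultdict(
        lambda: defaultdict(list))
    for rec in records:
        stratum = tuple(rec[name] for name in hold_fixed)
        buckets[stratum][rec[vary]].append(1.0 / rec["rank"])
    return {
        stratum: {level: sum(rr) / len(rr) for level, rr in levels.items()}
        for stratum, levels in buckets.items()
    }


# Made-up records: vary the toolchain while holding function size fixed.
records = [
    {"toolchain": "default", "size": "small", "rank": 2},
    {"toolchain": "vendor", "size": "small", "rank": 1},
    {"toolchain": "default", "size": "small", "rank": 1},
    {"toolchain": "vendor", "size": "large", "rank": 4},
]
table = mrr_by_factor(records, vary="toolchain", hold_fixed=["size"])
```

Comparing levels only within a stratum keeps differences in the other factors from being mistaken for an effect of the varied one.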

Load-bearing premise

The collection of 60,000 firmware images from 200 vendors supplies enough variety in vulnerabilities, third-party libraries, and compilation environments to support broad conclusions about BCSD behavior.

What would settle it

Running the same evaluation protocol on a fresh set of firmware images from additional vendors and checking whether MRR under build-aware queries stays above 0.95 or falls back toward the 0.818 baseline.

Figures

Figures reproduced from arXiv: 2606.28870 by Chaopeng Dong, Hong Li, Hongsong Zhu, Jie Liu, Jingdong Guo, Siyuan Li, Yimo Ren.

Figure 1: The BCSD pipeline for vulnerability discovery. A vulnerable function (query) is input to the BCSD…
Figure 2: Overview of our study architecture. 2.1 Data Collection and Preprocessing. To address the limitations of prior benchmarks, we constructed a large-scale, diverse, and realistic dataset derived entirely from real-world firmware. 2.1.1 Firmware Dataset Construction. We constructed our firmware dataset according to three principles that address C1: • Source discovery and normalization. We collect firmware from o…
Figure 3: Performance on the BinKit benchmark (trained and evaluated on BinKit using its standard split). Bars…
Figure 4: Impact of different versions of vulnerable functions on BCSD performance.
Figure 5: Non-linear impact of function size on BCSD performance. (a) Long-tailed size distribution in our…
Figure 6: BCSD performance across vendor-specific OpenSSL builds. Even with version/architecture/optimization…
Figure 7 compares the Control-Flow Graph (CFG) of OpenSSL’s ASN1_verify function under two build configurations. The default compilation (a) features a distributed error-handling architecture. In stark contrast, the in-the-wild version (b) is transformed by "High Impact" macros from…
Original abstract

Firmware lies at the heart of IoT devices. Its development depends heavily on third-party libraries (TPLs), which greatly accelerate the process but simultaneously introduce associated vulnerabilities. Binary Code Similarity Detection (BCSD) is an effective technique for identifying vulnerabilities in firmware by comparing pairs of code segments. However, existing studies either evaluate their performance only on small-scale datasets or lack diversity in terms of vulnerabilities, TPLs, and firmware. Consequently, a comprehensive understanding of BCSD for real-world vulnerability detection remains absent. To bridge this gap, we conduct a large-scale study of vulnerability detection across 60,000 firmware images from 200 vendors using BCSD. Rather than introducing a novel model, we examine the influence of four key factors (vulnerable function versions, vulnerability search space, function sizes, and compilation toolchains) on BCSD performance. Our results reveal that these factors substantially affect performance, often by wide margins. To address this, we propose a build-aware query strategy that derives queries from representative real-world binaries, effectively closing the gap and raising the mean reciprocal rank (MRR) from 0.818 to 0.981. Furthermore, we demonstrate that a TPL-aware, two-stage search process significantly enhances accuracy, improving MRR by 18.5% by limiting the search space.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents a large-scale empirical study of Binary Code Similarity Detection (BCSD) for vulnerability detection in firmware. It evaluates BCSD performance across 60,000 firmware images from 200 vendors, examining the effects of four factors (vulnerable function versions, search space, function sizes, and compilation toolchains). The authors propose a build-aware query strategy that raises MRR from 0.818 to 0.981 and a TPL-aware two-stage search that improves MRR by 18.5%.

Significance. If the dataset is representative, the work provides useful empirical insights into real-world BCSD limitations and practical mitigation strategies, addressing the diversity shortcomings of prior smaller-scale studies. The scale of the corpus is a clear strength.

major comments (2)
  1. [Abstract] Abstract: the motivation criticizes prior studies for insufficient diversity in vulnerabilities, TPLs, and firmware, yet supplies no quantitative evidence (vendor distribution histograms, architecture coverage, TPL frequency counts, or labeling methodology) that the 60k corpus overcomes those limitations. This directly undermines the generalizability of the reported MRR gains.
  2. [Abstract] Abstract and experimental description: no details are given on baseline BCSD implementations, statistical significance tests, or curation/labeling procedures for the 60k dataset. These omissions make it impossible to assess whether the 0.818→0.981 and +18.5% improvements are robust or artifactual.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving clarity and transparency in the abstract and experimental sections. We address each point below and will revise the manuscript to incorporate additional details where feasible.

  1. Referee: [Abstract] Abstract: the motivation criticizes prior studies for insufficient diversity in vulnerabilities, TPLs, and firmware, yet supplies no quantitative evidence (vendor distribution histograms, architecture coverage, TPL frequency counts, or labeling methodology) that the 60k corpus overcomes those limitations. This directly undermines the generalizability of the reported MRR gains.

    Authors: We agree that the abstract, due to length constraints, does not include quantitative summaries of dataset diversity. The full manuscript (Section 3) contains vendor distribution details across 200 vendors, architecture coverage (e.g., ARM, x86, MIPS), TPL frequency counts, and labeling methodology based on CVE matching and binary analysis. To strengthen the motivation and generalizability claims, we will revise the abstract to include concise quantitative evidence, such as the number of unique TPLs and architectures represented. revision: yes

  2. Referee: [Abstract] Abstract and experimental description: no details are given on baseline BCSD implementations, statistical significance tests, or curation/labeling procedures for the 60k dataset. These omissions make it impossible to assess whether the 0.818→0.981 and +18.5% improvements are robust or artifactual.

    Authors: The experimental section describes the BCSD tools and dataset construction at a high level, but we acknowledge that explicit details on baseline implementations (e.g., specific versions of tools like BinDiff or Asm2Vec), statistical significance testing for the MRR improvements, and expanded curation/labeling procedures (e.g., exact CVE-to-binary mapping steps) are not sufficiently elaborated. We will revise the experimental description to add these elements, including any applicable significance tests, to allow better assessment of robustness. revision: yes
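As an illustration of the kind of significance check discussed in this exchange, a paired bootstrap over per-query reciprocal ranks gives a confidence interval for an MRR difference. This is a sketch under stated assumptions, not a test the paper is known to have run: it presumes both strategies were scored on the same queries, and the sample values are invented.

```python
import random
from typing import Sequence, Tuple


def paired_bootstrap_ci(baseline_rr: Sequence[float], improved_rr: Sequence[float],
                        n_resamples: int = 2000, alpha: float = 0.05,
                        seed: int = 0) -> Tuple[float, float, float]:
    """Percentile bootstrap interval for the mean paired difference.

    Returns (observed MRR difference, lower bound, upper bound).
    """
    if len(baseline_rr) != len(improved_rr) or not baseline_rr:
        raise ValueError("need two non-empty, equally long score lists")
    diffs = [new - old for old, new in zip(baseline_rr, improved_rr)]
    n = len(diffs)
    observed = sum(diffs) / n
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    means = []
    for _ in range(n_resamples):
        # Resample queries with replacement, keeping each pair together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return observed, lower, upper


# Made-up reciprocal ranks for six queries under two strategies.
baseline = [1.0, 0.5, 0.5, 0.25, 1.0, 0.5]
improved = [1.0, 1.0, 1.0, 0.5, 1.0, 1.0]
observed, lower, upper = paired_bootstrap_ci(baseline, improved)
```

An interval that excludes zero would suggest the improvement is unlikely to be a sampling artifact of the query set, though it says nothing about how representative the firmware corpus is.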

Circularity Check

0 steps flagged

No circularity: purely empirical measurements on external corpus

The paper reports an empirical large-scale study measuring BCSD performance factors (vulnerable function versions, search space, function sizes, toolchains) across 60k firmware images and then measures MRR gains from two proposed strategies (build-aware queries, TPL-aware two-stage search). These are direct experimental outcomes on held-out or representative binaries, not derivations, fitted parameters renamed as predictions, or self-citation chains. No equations, ansatzes, or uniqueness theorems appear; the MRR numbers (0.818→0.981, +18.5%) are observed deltas, not forced by construction. The representativeness concern raised by the skeptic is a validity/generalizability issue, not a circularity reduction. The work is self-contained against its own corpus benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper is an empirical study that relies on standard domain assumptions about BCSD effectiveness without introducing new free parameters or invented entities.

axioms (1)
  • domain assumption: Binary Code Similarity Detection (BCSD) is an effective technique for identifying vulnerabilities in firmware by comparing pairs of code segments.
    Presented as established background in the opening of the abstract.
