pith. machine review for the scientific record.

arxiv: 2605.13885 · v1 · submitted 2026-05-11 · 💻 cs.PL · cs.SE

Recognition: no theorem link

Quantitative Symbolic Patch Impact Analysis

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 04:57 UTC · model grok-4.3

classification 💻 cs.PL cs.SE
keywords patch impact analysis · quantitative equivalence · symbolic analysis · software patches · partial equivalence · CVE analysis · program verification · behavioral divergence

The pith

Quantitative partial equivalence analysis measures the fraction of inputs where a patch changes program behavior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces quantitative partial equivalence analysis to move beyond binary checks that only label programs equivalent or not. For patches, which are expected to alter behavior, the method uses symbolic analysis to find the specific input conditions under which the original and patched versions produce identical or different outputs, then quantifies the share of the input domain affected. A range-based search heuristic supplies an efficient sound lower bound on equivalence for numerical inputs. The approach is demonstrated on 90 CVE patches from Linux, Qemu, and FFmpeg plus Juliet and EqBench datasets, where it also uncovers five mislabeled equivalent pairs and the conditions that expose their divergence.
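The core idea can be illustrated in a few lines. This is a toy sketch, not the paper's tool: the program pair is invented, and the domain is shrunk to 8-bit inputs so that the divergence fraction can be computed exhaustively rather than symbolically.

```python
# Toy illustration of quantitative partial equivalence analysis:
# exhaustively compare an "original" and a "patched" function over a tiny domain.

def original(x: int) -> int:
    return x * 2                # hypothetical vulnerable version

def patched(x: int) -> int:
    return min(x * 2, 200)      # hypothetical patch: clamp the result

domain = range(256)             # stand-in for the full input domain
diverging = [x for x in domain if original(x) != patched(x)]

# The quantitative measure: what fraction of inputs does the patch touch?
impact = len(diverging) / len(domain)
# The divergence condition, recoverable by inspection: x * 2 > 200, i.e. x >= 101.
```

The paper's contribution is to recover both the fraction and the divergence condition symbolically, where exhaustive enumeration is infeasible.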

Core claim

We introduce quantitative partial equivalence analysis, an approach for assessing software patches by quantifying behavioral differences between the original vulnerable code and the patched code. Using symbolic analysis we identify input conditions under which patched and original programs exhibit identical or divergent behaviors, and we refine non-equivalence by measuring the extent of behavioral divergence across the input domain; for numerical domains a range-based search heuristic supplies a sound lower bound on equivalence.

What carries the argument

Quantitative partial equivalence analysis, which identifies input conditions for behavioral divergence via symbolic execution and bounds the affected input fraction with a range-based search heuristic.

If this is right

  • Developers obtain concrete measures of how much of the input space is affected by each patch rather than a yes/no answer.
  • The method locates the precise input conditions under which patched and original code diverge.
  • Experiments on real CVE patches from Linux, Qemu, and FFmpeg demonstrate practical characterization of patch impact.
  • Re-examination of the EqBench benchmark identifies five mislabeled equivalent pairs together with the conditions that expose their divergence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The quantified divergence measure could be used to rank patches by the size of their behavioral footprint before deployment.
  • The same condition-identification technique might be applied to regression testing to focus test effort on the diverging input regions.
  • If the symbolic conditions are precise enough, they could serve as witnesses to confirm that a patch actually closes a vulnerability for relevant attack inputs.
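The first extension above can be made concrete in a couple of lines; the patch identifiers and divergence fractions here are invented for illustration.

```python
# Hypothetical per-patch divergence fractions (share of the input domain
# affected), as quantitative partial equivalence analysis would report them.
patch_impact = {"CVE-2026-0001": 0.02, "CVE-2026-0002": 0.61, "CVE-2026-0003": 0.00012}

# Rank patches by the size of their behavioral footprint before deployment.
ranked = sorted(patch_impact, key=patch_impact.get, reverse=True)
```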

Load-bearing premise

The range-based search heuristic supplies a sound lower bound on equivalence, and the symbolic analysis identifies all significant diverging input conditions without omission.

What would settle it

Execution of a program pair that violates the computed lower bound on equivalence or that reveals a substantial diverging input region missed by the symbolic conditions.
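One way such a settling check could be run in practice: sample concrete inputs and compare the observed equivalent fraction against the reported bound. The program pair and the claimed bound below are invented for illustration; they are not values from the paper.

```python
import random

def original(x: int) -> int:
    return x * 2                 # toy pair, invented for illustration

def patched(x: int) -> int:
    return min(x * 2, 200)

# Suppose an analysis reported this sound lower bound on the equivalent
# fraction of the domain [0, 256); the value is assumed, not from the paper.
claimed_lower_bound = 0.375

random.seed(0)
sample = [random.randrange(256) for _ in range(10_000)]
observed = sum(original(x) == patched(x) for x in sample) / len(sample)

# Soundness would be refuted only if concrete execution fell clearly below
# the bound; the slack absorbs sampling noise. Here the true equivalent
# fraction is 101/256, so the bound survives.
violated = observed < claimed_lower_bound - 0.03
```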

Figures

Figures reproduced from arXiv: 2605.13885 by Abdus Satter, Laboni Sarker, Tevfik Bultan.

Figure 1. Patch impact analysis results showing the size of the patch impact surface.
Figure 2. Comparison of Combined and Relational Range-Search approaches.
Figure 3. Performance comparison of different approaches on equivalence bound.
Figure 4. Average analysis time for different approaches.
Original abstract

Traditional equivalence checking classifies programs as equivalent or non-equivalent, providing insufficient information for tasks like patch impact analysis where it is expected the patched version of the program to be non-equivalent to the original program. When two program versions are non-equivalent, determining under what conditions they differ and what percentage of inputs are affected remains an open challenge. In this work, we introduce quantitative partial equivalence analysis, an approach for assessing software patches by quantifying behavioral differences between the original (vulnerable) code and the patched code. Using symbolic analysis, we identify input conditions under which patched and original programs exhibit identical or divergent behaviors. Our approach refines non-equivalence by measuring the extent of behavioral divergence across the input domain. For efficient quantitative analysis of numerical domains, we propose a range-based search heuristic that provides a sound lower bound on equivalence. We demonstrate our approach on 90 CVE patches from widely used open-source projects (Linux, Qemu, FFmpeg), as well as on a Juliet Test Suite-based dataset containing programs with CWEs. Our results show that quantitative partial equivalence analysis effectively characterizes and quantifies patch impact. Additionally, experiments on the EqBench benchmark reveal five C program pairs that are mislabeled as equivalent, and we identify the input conditions under which their behaviors diverge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces quantitative partial equivalence analysis, which employs symbolic analysis to identify input conditions under which original and patched program versions exhibit identical or divergent behavior. It proposes a range-based search heuristic claimed to deliver a sound lower bound on the measure of inputs for which the versions are equivalent, and evaluates the approach on 90 CVE patches from Linux, Qemu, and FFmpeg plus a Juliet-based dataset and the EqBench benchmark, reporting quantitative patch-impact characterizations and identifying five mislabeled equivalent pairs in EqBench.

Significance. If the soundness of the range-based heuristic holds, the work advances patch-impact analysis beyond binary equivalence by supplying quantitative measures of divergence, which can inform vulnerability scoping and patch prioritization. The EqBench findings also provide a concrete service to the equivalence-checking community by surfacing mislabeled benchmarks together with concrete divergence conditions.

major comments (3)
  1. [§4] §4 (Range-based Search Heuristic): The central claim that the heuristic supplies a sound lower bound on equivalence rests on unstated assumptions about input-domain partitioning. The manuscript does not address how interval ranges capture non-convex divergence sets, integer-overflow cases, or programs containing loops and pointers; any such violation would directly undermine the reported percentages of affected inputs and the EqBench mislabeling results.
  2. [§5.2] §5.2 (EqBench Experiments): The identification of five mislabeled equivalent pairs is load-bearing for the claim that the method reveals previously undetected divergences. The paper must supply explicit arguments or additional checks that symbolic path enumeration did not miss relevant divergence conditions due to path explosion or abstraction loss in the presence of C-language features such as pointers and non-linear arithmetic.
  3. [§5.1] §5.1 (CVE Patch Results): The quantitative percentages of affected inputs are presented as evidence that the method effectively characterizes patch impact. Without a precise statement of the assumed input-domain cardinality and how the range-based search partitions it, the numerical results cannot be interpreted or reproduced.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'quantitative partial equivalence analysis' is used without a one-sentence definition, which would help readers immediately grasp the contribution.
  2. [§3] Notation: The equivalence measure (percentage of inputs) is described informally; introducing a compact formal notation early would improve clarity when discussing the lower-bound property.
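One compact notation consistent with the abstract (our formalization, not the paper's): for program versions $P, P'$ over a finite input domain $D$,

```latex
% Equivalence measure: fraction of the input domain on which P and P' agree.
\mathrm{Eq}(P, P') \;=\; \frac{\bigl|\{\, x \in D : P(x) = P'(x) \,\}\bigr|}{|D|}
% Soundness of the range-based heuristic: its estimate never overstates Eq.
\widehat{\mathrm{Eq}}(P, P') \;\le\; \mathrm{Eq}(P, P')
```

With this notation, the lower-bound property raised in the major comments becomes a single inequality to argue for.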

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's detailed feedback, which has helped us improve the clarity and rigor of our presentation. We respond to each major comment below and indicate the revisions made to the manuscript.

Point-by-point responses
  1. Referee: [§4] §4 (Range-based Search Heuristic): The central claim that the heuristic supplies a sound lower bound on equivalence rests on unstated assumptions about input-domain partitioning. The manuscript does not address how interval ranges capture non-convex divergence sets, integer-overflow cases, or programs containing loops and pointers; any such violation would directly undermine the reported percentages of affected inputs and the EqBench mislabeling results.

    Authors: We agree that the assumptions underlying the range-based heuristic require explicit statement. The heuristic operates on the numerical input domains identified by symbolic execution and computes interval ranges that under-approximate the equivalence set, thereby providing a sound lower bound even if the actual equivalence set is non-convex. For integer overflows, the symbolic analysis models them explicitly using bit-vector semantics. Regarding loops and pointers, the approach relies on bounded symbolic execution to handle loops and uses pointer analysis to resolve aliases where possible; however, we acknowledge that complete handling of all C features remains challenging. We have revised §4 to include a dedicated subsection detailing these assumptions, the soundness argument for the lower bound, and limitations for non-numerical or complex control-flow cases. revision: yes

  2. Referee: [§5.2] §5.2 (EqBench Experiments): The identification of five mislabeled equivalent pairs is load-bearing for the claim that the method reveals previously undetected divergences. The paper must supply explicit arguments or additional checks that symbolic path enumeration did not miss relevant divergence conditions due to path explosion or abstraction loss in the presence of C-language features such as pointers and non-linear arithmetic.

    Authors: To address this, we have added to §5.2 a discussion of our symbolic execution configuration, including the use of KLEE with a timeout per path and loop bound of 100 iterations to mitigate explosion. For the five mislabeled pairs, we manually verified the divergence conditions by constructing concrete inputs that trigger the reported differences, confirming that the symbolic paths leading to divergence were enumerated. While we cannot exhaustively rule out missed paths in all cases due to the inherent undecidability of path explosion, the concrete counterexamples we provide serve as evidence that divergence exists. We have also included an additional experiment with increased bounds on a subset of EqBench to check for additional divergences. revision: partial

  3. Referee: [§5.1] §5.1 (CVE Patch Results): The quantitative percentages of affected inputs are presented as evidence that the method effectively characterizes patch impact. Without a precise statement of the assumed input-domain cardinality and how the range-based search partitions it, the numerical results cannot be interpreted or reproduced.

    Authors: We have revised §5.1 to explicitly state that we assume 32-bit integer input domains (total cardinality 2^32 for signed integers) unless otherwise specified by the program. The range-based search partitions this domain by recursively splitting intervals and using SMT queries to determine equivalence within each sub-range, reporting the measure of ranges where equivalence holds as a lower bound. This allows reproduction of the percentages as fractions of the assumed domain. revision: yes
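The splitting scheme the authors describe can be sketched as follows. The SMT query per sub-range is replaced by an exhaustive check purely for illustration, and the toy program pair is invented; only the recursive-splitting structure reflects the description above.

```python
def original(x: int) -> int:
    return x * 2                 # toy pair, invented for illustration

def patched(x: int) -> int:
    return min(x * 2, 200)

def range_search_lower_bound(f, g, lo: int, hi: int, min_width: int = 8) -> int:
    """Sound lower bound on how many inputs in [lo, hi) make f and g agree.

    Sketch of recursive interval splitting: prove whole sub-ranges
    equivalent, split undecided ones, and give up on ranges narrower than
    min_width (which keeps the bound sound, just looser). A real
    implementation would discharge each range with an SMT query."""
    if all(f(x) == g(x) for x in range(lo, hi)):  # stands in for "SMT: equivalent"
        return hi - lo
    if hi - lo <= min_width:
        return 0                                  # undecided: contributes nothing
    mid = (lo + hi) // 2
    return (range_search_lower_bound(f, g, lo, mid, min_width)
            + range_search_lower_bound(f, g, mid, hi, min_width))

bound = range_search_lower_bound(original, patched, 0, 256)
# True equivalent count is 101 (x = 0..100); the heuristic proves 96 of them.
```

Dividing the returned count by the domain cardinality yields the reported equivalence percentage as a fraction of the assumed domain.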

Circularity Check

0 steps flagged

No circularity: quantitative analysis extends standard symbolic techniques with independent experimental validation

full rationale

The paper defines quantitative partial equivalence analysis via symbolic identification of input conditions for behavioral divergence and introduces a range-based heuristic explicitly as a sound lower bound on equivalence. No equation or definition reduces the quantitative measure to a fitted parameter, self-referential construction, or load-bearing self-citation. Validation relies on external benchmarks (90 CVE patches, Juliet suite, EqBench) whose results are reported as independent measurements rather than derived from the method's own outputs. The derivation chain therefore remains self-contained against external data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are described. The method appears to rest on standard symbolic execution assumptions and the soundness claim for the new heuristic.

pith-pipeline@v0.9.0 · 5521 in / 1003 out tokens · 34701 ms · 2026-05-15T04:57:20.154650+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages

  1. [1]

    FFmpeg. https://github.com/FFmpeg/FFmpeg.git, accessed: 2025-03-10

  2. [2]

    Linux. https://github.com/torvalds/linux.git, accessed: 2025-03-10

  3. [3]

    Qemu. https://github.com/qemu/qemu.git, accessed: 2025-03-10
    [4] https://github.com/laboni68/PatchImpactAnalysis.git (2026)

  4. [4]

    In: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)

    Asad, M., Ganguly, K.K., Sakib, K.: Impact analysis of syntactic and semantic similarities on patch prioritization in automated program repair. In: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME). pp. 328–332. IEEE (2019)

  5. [5]

    In: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)

    Asad, M., Ganguly, K.K., Sakib, K.: Impact analysis of syntactic and semantic similarities on patch prioritization in automated program repair. In: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME). pp. 328–332 (2019).https://doi.org/10.1109/ICSME.2019.00050

  6. [6]

    Aydin, A., Bang, L., Bultan, T.: Automata-based model counting for string constraints. pp. 255–272 (07 2015). https://doi.org/10.1007/978-3-319-21690-4_15

  7. [7]

    Backes, J., Person, S., Rungta, N., Tkachuk, O.: Regression verification using impact summaries. vol. 7976 (07 2013).https://doi.org/10.1007/978-3-642-39176-7_7

  8. [8]

    Badihi, S., Akinotcho, F., Li, Y., Rubin, J.: Ardiff: scaling program equivalence checking via iterative abstraction and refinement of common code. pp. 13–24 (11 2020).https://doi.org/10.1145/3368089.3409757

  9. [9]

    Badihi, S., Li, Y., Rubin, J.: Eqbench: A dataset of equivalent and non-equivalent program pairs. pp. 610–614 (05 2021). https://doi.org/10.1109/MSR52588.2021.00084

  10. [10]

    http://www.math.ucdavis.edu/ latte/

    Baldoni, V., Berline, N., Loera, J.D., Dutra, B., Köppe, M., Moreinis, S., Pinto, G., Vergne, M., Wu, J.: Latte integrale v1.7.2. http://www.math.ucdavis.edu/ latte/

  11. [11]

    US Department of Commerce, National Institute of Standards and Technology

    Black, P.E., Black, P.E.: Juliet 1.3 test suite: Changes from 1.2. US Department of Commerce, National Institute of Standards and Technology ... (2018)

  12. [12]

    Computer 45(10), 88–90 (2012)

    Boland, T., Black, P.E.: Juliet 1.1 C/C++ and Java test suite. Computer 45(10), 88–90 (2012)

  13. [13]

    SIGPLAN Not

    Borges, M., Filieri, A., d’Amorim, M., Păsăreanu, C.S., Visser, W.: Compositional solution space quantification for probabilistic software analysis. SIGPLAN Not. 49(6), 123–132 (Jun 2014).https://doi.org/10.1145/2666356.2594329

  14. [14]

    Borges, M., Filieri, A., d’Amorim, M., Păsăreanu, C.: Iterative distribution-aware sampling for probabilistic symbolic execution. pp. 866–877 (08 2015). https://doi.org/10.1145/2786805.2786832

  15. [15]

    Cadar, C., Palikareva, H.: Shadow symbolic execution for better testing of evolving software (05 2014).https://doi.org/10.1145/2591062.2591104

  16. [16]

    In: Proceedings of the Second Asia-Pacific Workshop on Systems

    Chen, H., Mao, Y., Wang, X., Zhou, D., Zeldovich, N., Kaashoek, M.F.: Linux kernel vulnerabilities: State-of-the-art defenses and open problems. In: Proceedings of the Second Asia-Pacific Workshop on Systems. pp. 1–5 (2011)

  17. [17]

    Corporation, T.M.: Cwe category: Numeric errors,https://cwe.mitre.org/data/ definitions/189.html

  18. [18]

    In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

    Filieri, A., Păsăreanu, C.S., Visser, W., Geldenhuys, J.: Statistical symbolic execution with informed sampling. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. p. 437–448. FSE 2014, Association for Computing Machinery, New York, NY, USA (2014). https://doi.org/10.1145/2635868.2635899

  19. [19]

    In: 2015 30th IEEE/ACM International Con- ference on Automated Software Engineering (ASE)

    Filieri, A., Păsăreanu, C.S., Yang, G.: Quantification of software changes through probabilistic symbolic execution (n). In: 2015 30th IEEE/ACM International Con- ference on Automated Software Engineering (ASE). pp. 703–708. IEEE (2015)

  20. [20]

    In: Proceedings of the 2012 International Symposium on Software Testing and Analysis

    Geldenhuys, J., Dwyer, M.B., Visser, W.: Probabilistic symbolic execution. In: Proceedings of the 2012 International Symposium on Software Testing and Analysis. p. 166–176. ISSTA 2012, Association for Computing Machinery, New York, NY, USA (2012).https://doi.org/10.1145/2338965.2336773

  21. [21]

    Partners Universal Multidisciplinary Research Journal 1(2), 134–152 (2024)

    George, A.S.: When trust fails: Examining systemic risk in the digital economy from the 2024 crowdstrike outage. Partners Universal Multidisciplinary Research Journal 1(2), 134–152 (2024)

  22. [22]

    In: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis

    Ghanbari, A., Marcus, A.: Patch correctness assessment in automated program repair based on the impact of patches on production and test code. In: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. pp. 654–665 (2022)

  23. [23]

    Journal of Systems and Software 213, 112037 (03 2024). https://doi.org/10.1016/j.jss.2024.112037

    Glock, J., Pichler, J., Pinzger, M.: Pasda: A partition-based semantic differencing approach with best effort classification of undecided cases. Journal of Systems and Software 213, 112037 (03 2024). https://doi.org/10.1016/j.jss.2024.112037

  24. [24]

    In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

    Jiang, J., Xiong, Y., Zhang, H., Gao, Q., Chen, X.: Shaping program repair space with existing patches and similar code. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. p. 298–309. ISSTA 2018, Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3213846.3213871

  25. [25]

    In: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering

    Jiang, Z., Shi, L., Yang, G., Wang, Q.: Patuntrack: Automated generating patch examples for issue reports without tracked insecure code. In: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. p. 1–13. ASE ’24, Association for Computing Machinery, New York, NY, USA (2024). https://doi.org/10.1145/3691620.3694982

  26. [26]

    BlackHat, Las Vegas, USA (2014)

    Kallenberg, C., Kovah, X., Butterworth, J., Cornwell, S.: Extreme privilege escala- tion on windows 8/uefi systems. BlackHat, Las Vegas, USA (2014)

  27. [27]

    Microsoft, MSR- TR-2010-119, Tech

    Kawaguchi, M., Lahiri, S.K., Rebelo, H.: Conditional equivalence. Microsoft, MSR- TR-2010-119, Tech. Rep (2010)

  28. [28]

    133–151 (04 2018).https://doi.org/10.1007/978-3-319-89960-2_8

    Kim, S., McCamant, S.: Bit-Vector Model Counting Using Statistical Estimation, pp. 133–151 (04 2018).https://doi.org/10.1007/978-3-319-89960-2_8

  29. [29]

    ACM Transactions on Software Engineering and Methodology (TOSEM)27(3), 1–32 (2018)

    Kuchta, T., Palikareva, H., Cadar, C.: Shadow symbolic execution for testing software patches. ACM Transactions on Software Engineering and Methodology (TOSEM)27(3), 1–32 (2018)

  30. [30]

    Lahiri, S., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: Symdiff: A language-agnostic semantic diff tool for imperative programs (07 2012). https://doi.org/10.1007/978-3-642-31424-7_54

  31. [31]

    ACM SIGPLAN Notices 49 (06 2014). https://doi.org/10.1145/2594291.2594334

    Le, V., Afshari, M., Su, Z.: Compiler validation via equivalence modulo inputs. ACM SIGPLAN Notices 49 (06 2014). https://doi.org/10.1145/2594291.2594334

  32. [32]

    Lefdal, J.B., Reisæter, D.W.: Security patch management-an overview of the patching process and its challenges in norwegian businesses (2022)

  33. [33]

    SIGPLAN Not. 51(1), 298–312 (Jan 2016). https://doi.org/10.1145/2914770.2837617

    Long, F., Rinard, M.: Automatic patch generation by learning correct code. SIGPLAN Not. 51(1), 298–312 (Jan 2016). https://doi.org/10.1145/2914770.2837617

  34. [34]

    ACM SIGPLAN Notices 49 (06 2014). https://doi.org/10.1145/2594291.2594331

    Luu, L., Shinde, S., Saxena, P., Demsky, B.: A model counter for constraints over unbounded strings. ACM SIGPLAN Notices 49 (06 2014). https://doi.org/10.1145/2594291.2594331

  35. [35]

    In: 2020 IEEE Symposium on Security and Privacy (SP)

    Machiry, A., Redini, N., Camellini, E., Kruegel, C., Vigna, G.: Spider: Enabling fast patch propagation in related software repositories. In: 2020 IEEE Symposium on Security and Privacy (SP). pp. 1562–1579 (2020). https://doi.org/10.1109/SP40000.2020.00038

  36. [36]

    In: Proceedings of the 38th international conference on software engineering

    Mechtaev, S., Yi, J., Roychoudhury, A.: Angelix: Scalable multiline program patch synthesis via symbolic analysis. In: Proceedings of the 38th international conference on software engineering. pp. 691–701 (2016)

  37. [37]

    Mora, F., Li, Y., Rubin, J., Chechik, M.: Client-specific equivalence checking. pp. 441–451 (09 2018).https://doi.org/10.1145/3238147.3238178

  38. [38]

    In: Computer Safety, Reliability, and Security: 34th International Conference, SAFECOMP 2015, Delft, The Netherlands, September 23-25, 2015, Proceedings 34

    Muntean, P., Kommanapalli, V., Ibing, A., Eckert, C.: Automated generation of buffer overflow quick fixes using symbolic execution and smt. In: Computer Safety, Reliability, and Security: 34th International Conference, SAFECOMP 2015, Delft, The Netherlands, September 23-25, 2015, Proceedings 34. pp. 441–456. Springer (2015)

  39. [39]

    In: 2021 18th NAFOSTED Conference on Information and Computer Science (NICS)

    Ninh, T., Hung Nguyen, V., Shone, N., Babenko, M.: Function exclusion in au- tomated security patch testing using chopped symbolic execution. In: 2021 18th NAFOSTED Conference on Information and Computer Science (NICS). IEEE (2022)

  40. [40]

    Niu, S., Mo, J., Zhang, Z., Lv, Z.: Overview of linux vulnerabilities. In: Proceedings of the 2nd International Conference on Soft Computing in Information Communication Technology. pp. 225–228. Atlantis Press (2014/05). https://doi.org/10.2991/scict-14.2014.55

  41. [41]

    https://platform.openai.com/docs/models/gpt-4o (2023), accessed: 2026-01-18

    OpenAI: Gpt-4o model. https://platform.openai.com/docs/models/gpt-4o (2023), accessed: 2026-01-18

  42. [42]

    Palikareva, H., Kuchta, T., Cadar, C.: Shadow of a doubt: Testing for divergences between software versions (05 2016).https://doi.org/10.1145/2884781.2884845

  43. [43]

    In: Proceedings of the 16th ACM SIGSOFT International Sympo- sium on Foundations of Software Engineering

    Person, S., Dwyer, M.B., Elbaum, S., Păsăreanu, C.S.: Differential symbolic execution. In: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering. p. 226–237. SIGSOFT ’08/FSE-16, Association for Computing Machinery, New York, NY, USA (2008). https://doi.org/10.1145/1453101.1453131

  44. [44]

    In: 2011 IEEE International Conference on Computer Science and Automation Engineering

    Ponomarenko, A., Rubanov, V.: Automatic backward compatibility analysis of software component binary interfaces. In: 2011 IEEE International Conference on Computer Science and Automation Engineering. vol. 3, pp. 167–173 (2011). https://doi.org/10.1109/CSAE.2011.5952657

  45. [45]

    Pulliainen, T.: Linux patch management: Comparison of practical implementations (2016)

  46. [46]

    IEEE Access 5, 20777–20784 (2017)

    Qiang, W., Liao, Y., Sun, G., Yang, L.T., Zou, D., Jin, H.: Patch-related vulnerability detection based on symbolic execution. IEEE Access 5, 20777–20784 (2017)

  47. [48]

    In: 24th USENIX Security Symposium (USENIX Security 15)

    Ramos, D.A., Engler, D.: Under-constrained symbolic execution: Correctness checking for real code. In: 24th USENIX Security Symposium (USENIX Security 15). pp. 49–64 (2015)

  48. [49]

    In: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis

    Sarker, L.: Quantitative symbolic similarity analysis. In: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. p. 1549–1551. ISSTA 2023, Association for Computing Machinery, New York, NY, USA (2023).https://doi.org/10.1145/3597926.3605238

  49. [50]

    Sarker, L., Bultan, T.: Quantitative symbolic non-equivalence analysis. pp. 2452–2453 (10 2024). https://doi.org/10.1145/3691620.3695324

  50. [51]

    Empirical Software Engineering 27(6), 151 (2022)

    Sawadogo, A.D., Bissyandé, T.F., Moha, N., Allix, K., Klein, J., Li, L., Le Traon, Y.: Sspcatcher: Learning to catch security patches. Empirical Software Engineering 27(6), 151 (2022)

  51. [52]

    In: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering

    Shafiuzzaman, M., Desai, A., Sarker, L., Bultan, T.: Stase: Static analysis guided symbolic execution for uefi vulnerability signature generation. In: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. pp. 1783–1794 (2024)

  52. [53]

    Sharma, S., Roy, S., Soos, M., Meel, K.: Ganak: A scalable probabilistic exact model counter. pp. 1169–1176 (08 2019).https://doi.org/10.24963/ijcai.2019/163

  53. [54]

    Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N., Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser, C., Kruegel, C., Vigna, G.: Sok: (state of) the art of war: Offensive techniques in binary analysis. pp. 138–157 (05 2016). https://doi.org/10.1109/SP.2016.17

  54. [55]

    In: 2012 34th international conference on software engineering (ICSE)

    Tian, Y., Lawall, J., Lo, D.: Identifying linux bug fixing patches. In: 2012 34th international conference on software engineering (ICSE). pp. 386–396. IEEE (2012)

  55. [56]

    In: Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, Proceedings, Part II

    Trinh, M.T., Chu, D.H., Jaffar, J.: Model counting for recursively-defined strings. In: Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, Proceedings, Part II. pp. 399–418 (2017)

  56. [57]

    Trostanetski, A., Grumberg, O., Kroening, D.: Modular demand-driven analysis of semantic difference for program versions. pp. 405–427 (08 2017). https://doi.org/10.1007/978-3-319-66706-5_20

  57. [58]

    In: 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)

    Wang, X., Sun, K., Batcheller, A., Jajodia, S.: Detecting "0-day" vulnerability: An empirical study of secret security patch in oss. In: 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). pp. 485–492. IEEE (2019)

  58. [59]

    In: 2020 IEEE Conference on Communications and Network Security (CNS)

    Wang, X., Wang, S., Sun, K., Batcheller, A., Jajodia, S.: A machine learning ap- proach to classify security patches into vulnerability types. In: 2020 IEEE Conference on Communications and Network Security (CNS). pp. 1–9. IEEE (2020)

  59. [60]

    Wang, X., Wang, S., Sun, K., Batcheller, A., Jajodia, S.: A machine learning approach to classify security patches into vulnerability types. pp. 1–9. IEEE Conference on Communications and Network Security (CNS) (06 2020). https://doi.org/10.1109/CNS48642.2020.9162237

  60. [61]

    IEEE Transactions on Dependable and Secure Computing (2022)

    Wu, B., Liu, S., Feng, R., Xie, X., Siow, J., Lin, S.W.: Enhancing security patch identification by capturing structures in commits. IEEE Transactions on Dependable and Secure Computing (2022)

  61. [62]

    In: 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)

    Xu, Z., Chen, B., Chandramohan, M., Liu, Y., Song, F.: Spain: Security patch analysis for binaries towards understanding the pain and pills. In: 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). pp. 462–472 (2017). https://doi.org/10.1109/ICSE.2017.49

  62. [63]

    ACM Transactions on Software Engineering and Methodology 32(06 2023).https://doi.org/10.1145/3604608

    Yang, S., Xu, Z., Xiao, Y., Tang, W., Liu, Y., Shi, Z., Li, H., Sun, L.: Towards practical binary code similarity detection: Vulnerability verification via patch semantic analysis. ACM Transactions on Software Engineering and Methodology 32(06 2023).https://doi.org/10.1145/3604608

  63. [64]

    In: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)

    Zhou, J., Pacheco, M., Wan, Z., Xia, X., Lo, D., Wang, Y., Hassan, A.E.: Finding a needle in a haystack: Automated mining of silent vulnerability fixes. In: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). pp. 705–716. IEEE (2021)

  64. [65]

    ACM Transactions on Software Engineering and Methodology (TOSEM)31(1), 1–27 (2021)

    Zhou, Y., Siow, J.K., Wang, C., Liu, S., Liu, Y.: Spi: Automated identification of security patches via commits. ACM Transactions on Software Engineering and Methodology (TOSEM)31(1), 1–27 (2021)