arxiv: 2604.17977 · v1 · submitted 2026-04-20 · 💻 cs.SE


MASFuzzer: Fuzz Driver Generation and Adaptive Scheduling via Multidimensional API Sequences

Xingyu Liu , Zengqin Huang , Xiang Gao , Hailong Sun


Pith reviewed 2026-05-10 05:06 UTC · model grok-4.3

classification 💻 cs.SE

keywords: fuzz driver generation · API sequence mining · large language models · library fuzzing · code coverage · vulnerability discovery · adaptive scheduling · software testing


The pith

MASFuzzer builds multidimensional API sequences from code examples to let LLMs generate fuzz drivers that reach deeper library code.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MASFuzzer as a way to automate fuzz driver creation for software libraries. It extracts and mines multidimensional API sequences from usage examples in the source code through mutation-propagation and semantic-aware techniques. These sequences prompt large language models to produce initial drivers. A coverage-guided scheduler then allocates testing time to the most effective drivers while a mutation strategy evolves them further. This process targets previously untested branches in complex library behaviors.

Core claim

MASFuzzer synthesizes context-relevant API call sequences by referring to API usage examples from the codebase and applying mutation-propagation-based and semantic-aware API sequence mining. These multidimensional API sequences serve as the basis for LLMs to generate effective initial drivers. In addition, MASFuzzer incorporates a coverage-guided scheduler that prioritizes testing time for the most promising drivers, along with a driver mutation strategy to evolve them. This enables systematic generation of fuzz drivers to explore previously untested code regions.

What carries the argument

multidimensional API sequences mined from codebase usage examples via mutation-propagation and semantic-aware methods, which prompt LLMs for driver generation and guide adaptive scheduling
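To make the usage-example dimension concrete, here is a minimal sketch of pulling an ordered API call sequence out of one usage example. It is not the paper's miner: it uses Python's `ast` module on a Python snippet as a stand-in for whatever source analysis MASFuzzer applies, and the names `extract_api_sequence`, `lib`, `ctx_new`, `ctx_load`, and `ctx_free` are hypothetical. The mutation-propagation and semantic-aware dimensions are not reproduced here.

```python
import ast


def extract_api_sequence(example_source: str, api_names: set) -> list:
    """Return the library API names called in the example, in source order.

    Assumption: source order approximates call order. Nested calls such as
    f(g(x)) are listed outer-first, which is not their execution order.
    """
    tree = ast.parse(example_source)
    calls = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both `lib.ctx_new()` (Attribute) and `ctx_new()` (Name).
            if isinstance(func, ast.Attribute):
                name = func.attr
            else:
                name = getattr(func, "id", None)
            if name in api_names:
                calls.append((node.lineno, node.col_offset, name))
    return [name for _, _, name in sorted(calls)]


# Hypothetical usage example for an imaginary library `lib`.
EXAMPLE = """
def demo(path):
    ctx = lib.ctx_new()
    lib.ctx_load(ctx, path)
    print("loaded")
    lib.ctx_free(ctx)
"""

sequence = extract_api_sequence(EXAMPLE, {"ctx_new", "ctx_load", "ctx_free"})
```

A sequence like this would then be one ingredient of the prompt handed to the LLM; how the paper merges it with the other dimensions is not described on this page.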

If this is right

The generated drivers achieve 8.54 percent higher code coverage than state-of-the-art techniques across the evaluated libraries.
The approach uncovers 16 previously unknown vulnerabilities in well-tested libraries, with 14 confirmed by developers.
Nine of the discovered issues receive CVE identifiers.
Adaptive scheduling and driver mutation allow systematic expansion into untested code regions over time.
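The adaptive scheduling in the last point can be pictured with a small sketch. The allocation rule below (an equal floor for every driver, with the rest of the budget split in proportion to the new branches each driver found in its most recent time slice) is an assumption chosen for illustration; the page does not state MASFuzzer's actual formula. `DriverStats`, `record_slice`, `allocate_budget`, and `floor_fraction` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class DriverStats:
    name: str
    covered: set = field(default_factory=set)  # branch ids this driver has hit
    seconds_spent: float = 0.0
    recent_gain: int = 0  # globally new branches found in the last slice


def record_slice(driver, global_coverage, hit_branches, seconds):
    """Update one driver and the global coverage set after a fuzzing slice."""
    new_branches = set(hit_branches) - global_coverage
    driver.covered |= set(hit_branches)
    driver.recent_gain = len(new_branches)
    driver.seconds_spent += seconds
    global_coverage |= new_branches
    return len(new_branches)


def allocate_budget(drivers, total_seconds, floor_fraction=0.1):
    """Split the next round's time budget across drivers.

    Assumed rule: a small equal floor keeps every driver alive, and the
    remainder is proportional to each driver's most recent coverage gain.
    """
    count = len(drivers)
    floor = total_seconds * floor_fraction / count
    remaining = total_seconds - floor * count
    total_gain = sum(d.recent_gain for d in drivers)
    budget = {}
    for d in drivers:
        share = d.recent_gain / total_gain if total_gain else 1 / count
        budget[d.name] = floor + remaining * share
    return budget
```

The floor is a design choice: without it, a driver that stalls for one slice would never be scheduled again, even if a later mutation could unlock new branches.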

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The mining techniques might transfer to generating test cases for other structured interfaces such as REST APIs or device drivers.
If mining quality varies across codebases with sparse usage examples, the coverage gains would likely diminish.
Pairing the sequence construction with static analysis could strengthen the relevance of sequences for specific vulnerability classes.

Load-bearing premise

API usage examples extracted from the codebase and mined via mutation-propagation and semantic methods will reliably produce context-relevant sequences that enable LLMs to generate drivers reaching deep, previously untested branches.

What would settle it

An independent run of MASFuzzer against the same 12 libraries that yields code coverage no higher than existing tools or zero new vulnerabilities would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2604.17977 by Hailong Sun, Xiang Gao, Xingyu Liu, Zengqin Huang.

**Figure 2.** Overall workflow of MASFUZZER. Adjacent paper text (B. Multidimensional API Sequences): Multidimensional API sequences model valid API usage orders, providing the domain-specific insights needed for fuzz driver generation. The process begins with a syntactic analysis of the target library to extract function-level metainfo, including signatures, bodies, file names, line numbers, and documentation comments. This metadata forms t…

**Figure 3.** API sequence extracted from usage example.

**Figure 4.** API sequence from semantic analysis.

**Figure 5.** Ablation results of MASFUZZER with different dimensions of API sequences. Adjacent paper text: …more, lacking the driver-level time scheduling and mutation strategies inherent to MASFUZZER, PROMEFUZZ is unable to further exploit API relationships to optimize exploration after generation. Instead, it relies on the random scheduling of the underlying fuzzer, rendering the effectiveness of its RAG-generated fuzz drivers heavily d…

**Figure 5.** Overall Results. Adjacent paper text: Across all target libraries, MASFUZZER using all dimensions of API sequences achieves the highest coverage. The variant using API sequences from both usage examples and mutation-propagation analysis ranked second, with an average coverage 5.17% less than MASFUZZER. In contrast, the variant relying solely on API sequences from usage examples performed the worst, with an average coverage 11.…

**Figure 6.** Vulnerable code of a CVE-assigned bug in
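The text next to Figure 2 mentions a syntactic pass that collects function-level metainfo (signatures, bodies, file names, line numbers, documentation comments). A rough sketch of what such a record could look like follows. It parses Python source with the standard `ast` module purely as an analogy; the paper's targets and tooling are not specified on this page, and `FunctionMeta` and `extract_metainfo` are hypothetical names.

```python
import ast
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionMeta:
    name: str
    signature: str
    body: str
    file_name: str
    start_line: int
    end_line: int
    doc: Optional[str]


def extract_metainfo(source: str, file_name: str) -> list:
    """Collect one FunctionMeta record per plain `def` in the given source."""
    tree = ast.parse(source, filename=file_name)
    records = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            records.append(
                FunctionMeta(
                    name=node.name,
                    # ast.unparse renders the parameter list back to text.
                    signature=f"{node.name}({ast.unparse(node.args)})",
                    body=ast.get_source_segment(source, node) or "",
                    file_name=file_name,
                    start_line=node.lineno,
                    end_line=node.end_lineno,
                    doc=ast.get_docstring(node),
                )
            )
    return sorted(records, key=lambda r: r.start_line)
```

Records of this shape are enough to index a library by function and to quote signatures and comments back to an LLM, which is presumably the role the metadata plays in the pipeline.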

Original abstract

Fuzz testing of software libraries relies on fuzz drivers to invoke library APIs. Traditionally, these drivers are written manually by developers - a process that is time-consuming and often inadequate for exercising complex program behaviors. While recent studies have explored the use of Large Language Models (LLMs) to automate fuzz driver generation, the resulting drivers often fail to cover deep program branches. To address these challenges, we propose MASFUZZER, a fuzzing framework that integrates multidimensional API sequence construction with adaptive fuzzing scheduling strategies to improve library testing. At its core, MASFUZZER synthesizes context-relevant API call sequences by referring to API usage examples from the codebase and applying mutation-propagation-based and semantic-aware API sequence mining. These multidimensional API sequences serve as the basis for LLMs to generate effective initial drivers. In addition, MASFUZZER incorporates a coverage-guided scheduler that prioritizes testing time for the most promising drivers, along with a driver mutation strategy to evolve them. This enables systematic generation of fuzz drivers to explore previously untested code regions. We evaluate MASFUZZER on 12 widely used open-source libraries. The results show that MASFUZZER achieves 8.54 percent higher code coverage than state-of-the-art techniques. Moreover, MASFUZZER uncovers 16 previously unknown vulnerabilities in extensively tested libraries, with 14 confirmed by developers and 9 assigned CVE identifiers. These results indicate that MASFUZZER provides an efficient and practical approach for fuzzing software libraries.
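For readers unfamiliar with the term, a fuzz driver is a small entry function that takes fuzzer-supplied bytes and pushes them through a sequence of library calls. The sketch below is a Python analogy using the standard `zlib` module; it is not output from MASFuzzer. In native fuzzing the comparable entry point is libFuzzer's `LLVMFuzzerTestOneInput`. The function name `fuzz_one_input` and the choice of API sequence are illustrative assumptions.

```python
import zlib


def fuzz_one_input(data: bytes) -> None:
    """Drive a compressobj -> compress -> flush -> decompress sequence."""
    if not data:
        return
    # Use the first byte to choose a parameter, the rest as the payload.
    level = data[0] % 10  # zlib compression levels run from 0 to 9
    payload = data[1:]

    compressor = zlib.compressobj(level)
    blob = compressor.compress(payload) + compressor.flush()

    decompressor = zlib.decompressobj()
    restored = decompressor.decompress(blob) + decompressor.flush()
    # Oracle: a compress/decompress round trip must preserve the payload.
    assert restored == payload, "round trip changed the payload"

    # Also feed the raw payload to the decoder; rejecting malformed
    # input with zlib.error is expected behaviour, not a bug.
    try:
        zlib.decompress(payload)
    except zlib.error:
        pass
```

A driver that calls only one API in isolation would never reach the state built up by the earlier calls, which is the gap the paper's API sequences are meant to close.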

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

MASFuzzer mines multidimensional API sequences to bootstrap LLM fuzz drivers and adds adaptive scheduling, reporting coverage gains and new bugs, but the evaluation does not isolate whether the sequences actually drive the improvements.


Hi, the main thing here is that MASFuzzer builds context-rich API sequences from codebase examples via mutation-propagation and semantic mining, feeds them to an LLM to create initial fuzz drivers, then uses coverage-guided scheduling plus driver mutation to evolve them. It claims 8.54 percent higher coverage than prior techniques plus 16 new vulnerabilities in 12 libraries, with most confirmed and several CVEs assigned. That pipeline for sequence synthesis to help the LLM reach deeper branches is the part presented as new.

The practical payoff is real: finding confirmed issues in extensively tested open-source libraries shows the end-to-end system can exercise more code than the baselines it compares against. The scheduler and mutation steps are familiar but applied sensibly to the generated drivers.

The soft spot is the missing isolation. Nothing in the description separates the contribution of the mined sequences from the scheduler and mutation alone; without an ablation that swaps the sequences for random or single-API versions while holding everything else fixed, it is possible the adaptive parts explain most of the measured lift. The abstract also skips baseline selection criteria, run counts for statistical checks, and precise coverage measurement details, which leaves the quantitative claims harder to evaluate. If the full paper supplies those controls it would help a lot.

This is aimed at researchers and engineers working on automated library testing and fuzzing. Someone looking for concrete ideas on LLM-assisted driver generation plus real bug data would find it worth reading. It has enough empirical grounding on actual code to deserve peer review, though the referees will need to press on the experimental design and component contributions.

Referee Report

2 major / 1 minor

Summary. The paper proposes MASFUZZER, a fuzzing framework that synthesizes context-relevant multidimensional API sequences from codebase usage examples via mutation-propagation and semantic-aware mining; these sequences guide LLMs to generate initial fuzz drivers, which are then evolved via a coverage-guided adaptive scheduler and driver mutation strategy. Evaluated on 12 open-source libraries, it claims 8.54% higher code coverage than state-of-the-art techniques and the discovery of 16 previously unknown vulnerabilities (14 developer-confirmed, 9 with CVEs).

Significance. If the empirical results hold under rigorous validation, the work would meaningfully advance automated library fuzzing by addressing the driver-generation bottleneck and enabling deeper branch coverage without extensive manual effort. The real-world bug findings in mature libraries add practical value, and the integration of mined sequences with adaptive scheduling offers a promising direction for LLM-assisted testing.

major comments (2)

[Evaluation section] The headline claim attributes the 8.54% coverage gain and 16 new vulnerabilities specifically to the multidimensional API sequence construction (via mutation-propagation and semantic mining). However, no ablation is reported that holds the coverage-guided scheduler and driver mutation fixed while replacing the mined sequences with random or single-API baselines. Without this isolation, it is impossible to establish that the sequence-mining component is load-bearing for the central claims rather than the scheduler alone driving the gains.
[Evaluation section] The reported quantitative improvements lack supporting details on baseline selection criteria, library sampling method, exact coverage measurement (e.g., line/branch coverage tool and granularity), or statistical significance testing for the 8.54% difference. These omissions prevent independent assessment of whether the results support the stated superiority over state-of-the-art techniques.
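The ablation asked for in the first major comment amounts to a run matrix in which only the sequence source changes. A minimal sketch of such a matrix follows; the variant labels, `RunConfig`, and `ablation_matrix` are hypothetical and are not taken from the paper's artifact.

```python
from dataclasses import dataclass
from itertools import product

# Assumed variant labels: the mined sequences plus the two control
# conditions the referee names (random sequences, single-API drivers).
SEQUENCE_SOURCES = ("mined_multidimensional", "random_sequences", "single_api")


@dataclass(frozen=True)
class RunConfig:
    library: str
    sequence_source: str
    trial: int
    scheduler: str = "coverage_guided"  # held fixed across every variant
    driver_mutation: bool = True        # held fixed across every variant


def ablation_matrix(libraries, trials):
    """Enumerate one run per (library, sequence source, trial)."""
    return [
        RunConfig(library=lib, sequence_source=src, trial=t)
        for lib, src, t in product(libraries, SEQUENCE_SOURCES, range(trials))
    ]
```

Because the scheduler and mutation fields never vary, any coverage difference between rows that share a library and trial can only come from the sequence source or from run-to-run noise, which is why repeated trials belong in the matrix.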

minor comments (1)

[Abstract] The phrase 'multidimensional API sequences' is used without a concise definition or illustrative example, which would aid reader comprehension before the detailed description in later sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the evaluation section. We address each major comment point by point below and will revise the manuscript to strengthen the presentation of our results.


Referee: [Evaluation section] The headline claim attributes the 8.54% coverage gain and 16 new vulnerabilities specifically to the multidimensional API sequence construction (via mutation-propagation and semantic mining). However, no ablation is reported that holds the coverage-guided scheduler and driver mutation fixed while replacing the mined sequences with random or single-API baselines. Without this isolation, it is impossible to establish that the sequence-mining component is load-bearing for the central claims rather than the scheduler alone driving the gains.

Authors: We agree that an ablation isolating the multidimensional API sequence construction would provide stronger evidence for its specific contribution. While our main evaluation compares against state-of-the-art techniques that lack this component, we acknowledge the need for an internal controlled comparison. In the revised manuscript, we will add an ablation study that keeps the coverage-guided scheduler and driver mutation strategy fixed and varies only the initial driver generation source (mined multidimensional sequences versus random sequences and single-API baselines). This will quantify the incremental benefit attributable to the sequence-mining step.

Revision: yes

Referee: [Evaluation section] The reported quantitative improvements lack supporting details on baseline selection criteria, library sampling method, exact coverage measurement (e.g., line/branch coverage tool and granularity), or statistical significance testing for the 8.54% difference. These omissions prevent independent assessment of whether the results support the stated superiority over state-of-the-art techniques.

Authors: We agree that these methodological details are essential for reproducibility and rigorous assessment. In the revised evaluation section, we will explicitly describe the baseline selection criteria (focusing on recent LLM-assisted and sequence-based fuzzing methods), the library sampling approach (12 diverse, widely-used open-source libraries chosen for domain variety and maturity), the coverage measurement process (including the specific tool and whether line/branch coverage is reported at function or statement granularity), and the statistical significance testing applied to the coverage differences (e.g., paired statistical tests with p-values).

Revision: yes
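One way to read "paired statistical tests with p-values" is an exact sign-flip permutation test on per-library coverage differences. The sketch below shows that test under stated assumptions: it is one reasonable choice, not necessarily the test the authors will use, and the coverage numbers are placeholders rather than results from the paper. `paired_permutation_test` is a hypothetical name.

```python
from itertools import product


def paired_permutation_test(treatment, baseline):
    """Exact two-sided sign-flip test on paired differences.

    Under the null hypothesis each paired difference is equally likely
    to be positive or negative, so every sign pattern is enumerated.
    With 12 libraries that is 2**12 = 4096 patterns.
    """
    if len(treatment) != len(baseline) or not treatment:
        raise ValueError("need two non-empty samples of equal length")
    diffs = [t - b for t, b in zip(treatment, baseline)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float inputs
            extreme += 1
    mean_diff = sum(diffs) / len(diffs)
    return mean_diff, extreme / total


# Placeholder branch counts for four imaginary libraries (not paper data).
tool_cov = [10, 12, 11, 13]
base_cov = [8, 9, 9, 10]
mean_diff, p_value = paired_permutation_test(tool_cov, base_cov)
```

With only four pairs the smallest attainable two-sided p-value is 2/16, which illustrates why the number of libraries and repeated runs matters for the referee's request.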

Circularity Check

0 steps flagged

No circularity: purely empirical system evaluation


The paper presents MASFUZZER as an engineering framework that mines multidimensional API sequences from codebases, feeds them to LLMs for initial driver generation, and applies coverage-guided scheduling plus mutation. All headline results (8.54% coverage lift, 16 new bugs) are obtained by running the complete system on 12 external open-source libraries and comparing against published baselines. No equations, fitted parameters, uniqueness theorems, or self-citation chains appear in the derivation; the evaluation is externally falsifiable and does not reduce any claimed outcome to a re-labeling of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The approach rests on standard assumptions from fuzzing literature that code coverage correlates with bug-finding power and that LLMs can translate API sequences into executable drivers; no free parameters, axioms, or invented entities are explicitly introduced in the abstract.


