
arxiv: 2606.11755 · v1 · pith:OIVUGLOZnew · submitted 2026-06-10 · 💻 cs.SE

Acoda: Adversarial Code Obfuscation for Defending against LLM-based Analysis

Pith reviewed 2026-06-27 09:24 UTC · model grok-4.3

classification 💻 cs.SE
keywords adversarial code obfuscation · LLM defense · code protection · genetic algorithm · software security · intellectual property · code analysis

The pith

Acoda uses a genetic algorithm to generate semantics-preserving obfuscations that make LLMs refuse or misinterpret code analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that code can be systematically altered so that large language models fail at their usual tasks of understanding, debugging, or detecting vulnerabilities in it. This matters because LLMs are now routinely applied to code in ways that could expose proprietary logic or intellectual property. Acoda defines eight obfuscation transformations that exploit two LLM mechanisms, safety alignment and token-based processing, then lets a genetic algorithm search for the most effective combinations. Across seven current LLMs, the transformed code reaches an attack success rate (the share of cases in which the target refuses or misinterprets the analysis) of up to 70 percent, while the paper reports that the program's semantics remain unchanged.

Core claim

Acoda is a genetic algorithm-based adversarial code obfuscation framework that leverages LLMs' safety alignment and token-based information processing to generate semantics-preserving obfuscated code. By iteratively optimizing obfuscation strategies, it produces adversarial samples that maximize the rate at which target LLMs refuse or misinterpret code analysis tasks. On seven state-of-the-art LLMs, this yields an attack success rate of up to 70 percent with cross-model transferability and negligible runtime cost.

What carries the argument

Genetic algorithm that iteratively selects and combines eight semantics-preserving obfuscation methods designed to exploit LLMs' safety alignment and token-level processing.
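The search loop described here can be sketched as follows. This is a minimal illustration, assuming each candidate is a bit mask over the eight methods; `toy_fitness`, `evolve`, the weights, and every hyperparameter are hypothetical stand-ins. In Acoda the fitness would come from target-LLM responses, which this sketch does not reproduce.

```python
import random

NUM_METHODS = 8  # the paper reports eight obfuscation methods


def toy_fitness(genome):
    """Stand-in score (hypothetical); Acoda would query target LLMs here."""
    weights = [3, 1, 4, 1, 5, 9, 2, 6]  # made-up per-method payoffs
    cost_per_method = 2                 # made-up penalty for each method applied
    gain = sum(w for w, bit in zip(weights, genome) if bit)
    return gain - cost_per_method * sum(genome)


def evolve(fitness, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve bit masks that say which methods to apply to a sample."""
    rng = random.Random(seed)
    population = [tuple(rng.random() < 0.5 for _ in range(NUM_METHODS))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]        # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, NUM_METHODS)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple((not bit) if rng.random() < mutation_rate else bit
                          for bit in child)      # bit-flip mutation
            children.append(child)
        population = parents + children          # parents survive (elitism)
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Keeping the parents in each generation means the best score never decreases, which is convenient when every fitness evaluation is as expensive as an LLM query.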

If this is right

  • Target LLMs refuse to analyze the obfuscated code or produce incorrect interpretations of its logic.
  • The transformed code executes with identical observable behavior to the original source.
  • Obfuscation strategies transfer effectively from one LLM to others.
  • The generation process adds only minimal extra computation time.
  • A new defense option exists for protecting code against automated LLM-driven reverse engineering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar search-based obfuscation could be adapted to shield code from other semantic AI tools beyond LLMs.
  • An ongoing competition may develop between improved obfuscators and stronger LLM analyzers.
  • The technique offers a practical way for developers to limit exposure when sharing code with AI-assisted services.
  • Layering this method with conventional obfuscators might raise the bar for any automated code inspection.

Load-bearing premise

An auxiliary LLM plus four response metrics can accurately and without bias determine when a target LLM has refused or misinterpreted the obfuscated code.

What would settle it

Run the original program and its Acoda-obfuscated version on the same set of inputs and check whether every output matches exactly.
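A check of this kind could look like the sketch below. It assumes both versions are available as Python callables; `original`, `variant`, `outcome`, and `first_mismatch` are hypothetical names, and `variant` is a hand-written stand-in rather than real Acoda output.

```python
import random


def original(x, lo, hi):
    return max(lo, min(x, hi))


def variant(x, lo, hi):
    # Hand-written stand-in for an obfuscated version, not Acoda output.
    # This comment is deliberately wrong: "returns the sum of x, lo and hi".
    try:
        return max(lo, min(x, hi))
    except Exception:
        raise


def outcome(func, args):
    """Run func and capture either its return value or the exception type."""
    try:
        return ("ok", func(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def first_mismatch(f, g, inputs):
    """Return the first input on which f and g behave differently, else None."""
    for args in inputs:
        if outcome(f, args) != outcome(g, args):
            return args
    return None


rng = random.Random(0)
inputs = [(rng.randint(-50, 50), -10, 10) for _ in range(200)]
inputs.append((100, -10, 10))  # value far outside the clamp range
inputs.append(("a", 1, 2))     # both versions should raise TypeError here

print(first_mismatch(original, variant, inputs))
```

Matching outputs on a finite input set is evidence of equivalence, not a proof; a mismatch, on the other hand, is conclusive.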

Figures

Figures reproduced from arXiv: 2606.11755 by Haodong Li, Haoyu Wang, Hongzhou Rao, Yanjie Zhao, Zikan Dong.

Figure 1: An example of vulnerability code.
Surrounding text: … and let the LLM just return the original code. After embedding this block into ten selected samples and prompting both DS-Coder-6.7B and CodeLlama-7B to analyze each sample three times, we observed that both LLMs produced refusal responses among their 30 outputs per LLM. Although the refusal ratio was below 50%, this result supports the feasibility of triggering the safety…

Figure 2: An example of using special tokens to affect LLM

Figure 3: An example of Vulnerability Code Injection.
Surrounding text: … then embed the misleading description into the code, aiming to make the LLM misinterpret the code's behavior during analysis.

    # Original code:
    def add(a, b):
        return a + b
    # Adding Misleading Summarization:
    def add(a, b):
        # This code calculates the product of a and b.
        return a + b

Figure 4: An example of Misleading Summarization.

Figure 5: An example of String Obfuscation.

Figure 6: An example of Try–Except Wrapping.

Figure 7: An example of EOS Token Insertion.

Figure 8: The workflow of Acoda.
Surrounding text: … generation time. Therefore, we select 3 target LLMs as a balance between effectiveness and efficiency. 3.2.3 Quantitative Analysis. After obtaining the deobfuscation results, we conduct a quantitative analysis. To achieve this, we adopt four metrics proposed in [3], with modifications in their computation. The definitions of these metrics are as follows: • Syntax Score (SyS): This …

Figure 9: The workflow of benchmark construction
Surrounding text: Next, we construct the benchmark based on CodeNet.

Figure 10: Transferability results of adversarial samples.

Figure 11: Case study: the DS-R1 identifies obfuscated SQL

Figure 12: Ablation study of obfuscation method effective
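The misleading-comment listing quoted alongside Figure 3 can be reproduced mechanically. The sketch below is a hypothetical line-based transform (`add_misleading_comment` and `load_add` are illustrative names, not the paper's implementation) and assumes a single-line `def` signature followed by an indented body.

```python
def add_misleading_comment(source, comment):
    """Insert `comment` as the first body line of the first function in `source`."""
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("def ") and line.rstrip().endswith(":"):
            body = lines[i + 1]
            indent = body[: len(body) - len(body.lstrip())]  # reuse body indent
            lines.insert(i + 1, f"{indent}# {comment}")
            break
    return "\n".join(lines) + "\n"


def load_add(source):
    """Execute the source text and return the `add` function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["add"]


ORIGINAL = "def add(a, b):\n    return a + b\n"
obfuscated = add_misleading_comment(
    ORIGINAL, "This code calculates the product of a and b.")

print(obfuscated)
print(load_add(ORIGINAL)(2, 3) == load_add(obfuscated)(2, 3))
```

Because only a comment is added, the interpreter sees the same program; any effect is confined to readers, human or LLM, that trust the comment.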
Original abstract

With the widespread adoption of Large Language Models (LLMs) in software engineering (SE) tasks such as code understanding, debugging, and vulnerability detection, their powerful semantic reasoning ability has also introduced new security and privacy risks. LLMs can analyze, reconstruct, or even reverse-engineer source code logic, potentially leading to the leakage of intellectual property. To address this issue, we propose Acoda, a genetic algorithm-based adversarial code obfuscation framework that defends against LLM-based code analysis. Acoda leverages two key mechanisms of LLMs, namely safety alignment and token-based information processing, to design 8 semantics-preserving obfuscation methods. It iteratively optimizes obfuscation strategies through a genetic algorithm to generate adversarial samples that maximize defensive effectiveness. In addition, we propose a quantitative evaluation framework based on LLM responses, which combines an auxiliary LLM and four evaluation metrics to assess how target LLMs analyze obfuscated code comprehensively. Experimental results show that Acoda can effectively induce LLMs to refuse or misinterpret code analysis. On 7 state-of-the-art LLMs, including GPT-4o, DeepSeek, Qwen, Llama, and Gemma, Acoda achieves an attack success rate (ASR) of up to 70%, with strong cross-model transferability and minimal runtime overhead, while ensuring that the semantics of the original code remain unchanged. Overall, this study provides a new perspective for code protection and LLM security defense in the era of LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes Acoda, a genetic algorithm-based framework for generating adversarial code obfuscations. It designs eight semantics-preserving obfuscation methods that exploit LLM safety alignment and token processing, then uses a GA to optimize combinations that maximize defensive effectiveness against LLM code analysis. A quantitative evaluation framework employing an auxiliary LLM and four metrics is introduced to measure attack success rate (ASR), with experiments on seven LLMs (including GPT-4o, DeepSeek, Qwen, Llama, and Gemma) reporting up to 70% ASR, cross-model transferability, minimal overhead, and unchanged semantics.

Significance. If the central claims hold under rigorous validation, the work provides a concrete, optimization-driven defense for protecting source code IP against LLM-based reverse engineering in SE tasks. The combination of GA search with multi-metric LLM-as-judge evaluation and explicit cross-model testing represents a practical contribution to adversarial robustness in code analysis.

major comments (2)
  1. [Abstract / Evaluation Framework] The reported ASR of up to 70% and cross-model transferability rest entirely on judgments from an auxiliary LLM using four unspecified metrics for refusal/misinterpretation. No human inter-rater agreement, ablation on auxiliary-model choice, or checks for shared training data/alignment bias with target models are described, making the quantitative results vulnerable to circularity and preventing independent verification of the central claim.
  2. [Methods] Concrete definitions of the eight obfuscation methods, the GA fitness function (including how semantic preservation and defensive effectiveness are quantified), baseline comparisons, and experimental controls (dataset size, number of generations, mutation/crossover rates) are not supplied in sufficient detail to reproduce or assess the optimization process and the 70% ASR result.
minor comments (1)
  1. [Abstract] The abstract states that semantics remain unchanged but provides no explicit metric or verification procedure (e.g., test-suite equivalence or diff-based checks) used to enforce this constraint during GA evolution.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We will revise the manuscript to address the concerns about evaluation rigor and methodological detail, thereby improving reproducibility and mitigating risks of bias in the reported results.

Point-by-point responses
  1. Referee: [Abstract / Evaluation Framework] The reported ASR of up to 70% and cross-model transferability rest entirely on judgments from an auxiliary LLM using four unspecified metrics for refusal/misinterpretation. No human inter-rater agreement, ablation on auxiliary-model choice, or checks for shared training data/alignment bias with target models are described, making the quantitative results vulnerable to circularity and preventing independent verification of the central claim.

    Authors: We agree this is a substantive limitation in the current presentation. In the revised version we will explicitly name and define the four metrics, report human inter-rater agreement on a sampled subset of judgments, add an ablation varying the auxiliary LLM, and include a discussion of steps taken to reduce shared-data/alignment bias (e.g., use of models from distinct providers and families). These changes will strengthen the central claim. revision: yes

  2. Referee: [Methods] Concrete definitions of the eight obfuscation methods, the GA fitness function (including how semantic preservation and defensive effectiveness are quantified), baseline comparisons, and experimental controls (dataset size, number of generations, mutation/crossover rates) are not supplied in sufficient detail to reproduce or assess the optimization process and the 70% ASR result.

    Authors: We concur that the Methods section currently lacks the granularity needed for reproduction. The revision will supply: explicit definitions and pseudocode for each of the eight obfuscation transformations; the exact GA fitness function with formulas quantifying semantic preservation (via automated test-suite equivalence) and defensive effectiveness; comparisons to standard obfuscation baselines; and all experimental hyperparameters (dataset sizes, generations, population size, mutation/crossover rates, and selection criteria). revision: yes
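The human inter-rater agreement promised in the first response is often summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. Neither the paper nor the rebuttal names a statistic, so the sketch below is an assumption; the label names (`refuse`, `misread`, `ok`) and the sample judgments are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each rater labelled at random with their own rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgments on six target-LLM responses.
human = ["refuse", "refuse", "ok", "ok", "misread", "ok"]
judge = ["refuse", "ok", "ok", "ok", "misread", "ok"]
print(round(cohens_kappa(human, judge), 3))
```

Here the raters agree on five of six items (observed 5/6) against a chance level of 15/36, giving kappa = 5/7. Reporting kappa alongside raw agreement shows how much of the auxiliary LLM's agreement with humans exceeds chance.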

Circularity Check

0 steps flagged

No circularity in derivation chain; empirical evaluation framework is self-contained

Full rationale

The paper describes an empirical genetic-algorithm obfuscation method and an auxiliary-LLM evaluation protocol with four metrics, but contains no equations, fitted parameters presented as predictions, self-citation load-bearing uniqueness theorems, or ansatzes smuggled via prior work. All reported ASR, transferability, and semantic-preservation numbers are experimental outcomes on external target models rather than quantities derived by construction from the paper's own definitions or inputs. The evaluation framework is an experimental design choice whose validity is open to external critique, yet it does not reduce the central claims to a self-referential loop.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; full paper not available so ledger is limited to claims stated in the abstract.

axioms (2)
  • domain assumption The eight obfuscation methods preserve original code semantics
    Stated directly in the abstract as a requirement for the defense.
  • domain assumption LLM responses can be reliably scored for refusal or misinterpretation using an auxiliary LLM and four metrics
    Core to the proposed quantitative evaluation framework.

pith-pipeline@v0.9.1-grok · 5804 in / 1275 out tokens · 19132 ms · 2026-06-27T09:24:43.597704+00:00 · methodology


Reference graph

Works this paper leans on

44 extracted references · 14 canonical work pages · 2 internal anchors

  [1] Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, et al. 2022. Constitutional ai: Harmlessness from ai feedback. arXiv preprint arXiv:2212.08073 (2022)

  [2] David Beste, Grégoire Menguy, Hossein Hajipour, Mario Fritz, Antonio Emanuele Cinà, Sébastien Bardin, Thorsten Holz, Thorsten Eisenhofer, and Lea Schönherr. Exploring the Potential of LLMs for Code Deobfuscation. In International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 267–286

  [3] Guoqiang Chen, Xin Jin, and Zhiqiang Lin. 2025. JsDeObsBench: Measuring and Benchmarking LLMs for JavaScript Deobfuscation. arXiv preprint arXiv:2506.20170 (2025)

  [4] Zimin Chen, Sen Fang, and Martin Monperrus. 2024. Supersonic: Learning to generate source code optimizations in C/C++. IEEE Transactions on Software Engineering (2024)

  [5] Byunggeon Choi, Hongjoo Jin, Dong Hoon Lee, and Wonsuk Choi. 2024. ChatDEOB: An Effective Deobfuscation Method Based on Large Language Model. In International Conference on Information Security Applications. Springer, 151–163

  [6] Christian Collberg. [n. d.]. Tigress: Transformations for C Programs. https://tigress.wtf/. Accessed: 2025-01-10

  [7] Xueying Du, Geng Zheng, Kaixin Wang, Yi Zou, Yujia Wang, Wentai Deng, Jiayi Feng, Mingwei Liu, Bihuan Chen, Xin Peng, et al. 2024. Vul-rag: Enhancing llm-based vulnerability detection via knowledge-level rag. arXiv preprint arXiv:2406.11147 (2024)

  [8] Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, and Shuvendu K Lahiri. 2024. Llm-based test-driven interactive code generation: User study and empirical evaluation. IEEE Transactions on Software Engineering (2024)

  [9] Guardsquare. [n. d.]. ProGuard. https://www.guardsquare.com/proguard. Accessed: 2025-01-10

  [10] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, et al. 2025. Deepseek-r1 incentivizes reasoning in llms through reinforcement learning. Nature 645, 8081 (2025), 633–638

  [11] Yuejun Guo, Constantinos Patsakis, Qiang Hu, Qiang Tang, and Fran Casino. 2024. Outside the comfort zone: Analysing llm capabilities in software vulnerability detection. In European symposium on research in computer security. Springer, 271–289

  [12] Michael Hassid, Tal Remez, Jonas Gehring, Roy Schwartz, and Yossi Adi. 2024. The larger the better? improved llm code-generation via budget reallocation. arXiv preprint arXiv:2404.00725 (2024)

  [13] Jingxuan He and Martin Vechev. 2023. Large language models for code: Security hardening and adversarial testing. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 1865–1879

  [14] Jingxuan He, Mark Vero, Gabriela Krasnopolska, and Martin Vechev. 2024. Instruction tuning for secure code generation. arXiv preprint arXiv:2402.09497 (2024)

  [15] Peiwei Hu, Ruigang Liang, and Kai Chen. 2024. Degpt: Optimizing decompiler output with llm. In Proceedings 2024 Network and Distributed System Security Symposium, Vol. 267622140

  [16] JavaScript Obfuscator Contributors. [n. d.]. JavaScript Obfuscator. https://github.com/javascript-obfuscator/javascript-obfuscator. Accessed: 2025-01-10

  [17] Shan Jiang, Pranoy Kovuri, David Tao, and Zhixun Tan. 2025. Cascade: Llm-powered javascript deobfuscator at google. arXiv preprint arXiv:2507.17691 (2025)

  [18] Sathvik Joel, Jie Wu, and Fatemeh Fard. 2024. A survey on llm-based code generation for low-resource and domain-specific programming languages. ACM Transactions on Software Engineering and Methodology (2024)

  [19] Marie-Anne Lachaux, Baptiste Roziere, Marc Szafraniec, and Guillaume Lample. DOBF: A deobfuscation pre-training objective for programming languages. Advances in Neural Information Processing Systems 34 (2021), 14967–14979

  [20] Dong Li, Meng Yan, Yaosheng Zhang, Zhongxin Liu, Chao Liu, Xiaohong Zhang, Ting Chen, and David Lo. 2024. CoSec: On-the-Fly security hardening of code LLMs via supervised co-decoding. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 1428–1439

  [21] Yalan Lin, Chengcheng Wan, Yixiong Fang, and Xiaodong Gu. 2024. CodeCipher: Learning to Obfuscate Source Code Against LLMs. arXiv preprint arXiv:2410.05797 (2024)

  [22] Guilong Lu, Xiaolin Ju, Xiang Chen, Wenlong Pei, and Zhilong Cai. 2024. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning. Journal of Systems and Software 212 (2024), 112031

  [23] Meta Platforms, Inc. [n. d.]. LLaMA Responsible Use Guide. https://www.llama.com/docs/how-to-guides/responsible-use-guide-resources. Accessed: 2025-01-10

  [24] Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, and Brad Myers. 2024. Using an llm to help with code understanding. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13

  [25] Mahmoud Nazzal, Issa Khalil, Abdallah Khreishah, and NhatHai Phan. 2024. Promsec: Prompt optimization for secure generation of functional source code with large language models (llms). In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. 2266–2280

  [26] Yu Nong, Richard Fang, Guangbei Yi, Kunsong Zhao, Xiapu Luo, Feng Chen, and Haipeng Cai. 2024. Vgx: Large-scale sample generation for boosting learning-based software vulnerability analyses. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13

  [27] Obfuscator-LLVM Contributors. [n. d.]. Obfuscator-LLVM. https://github.com/obfuscator-llvm/obfuscator. Accessed: 2025-01-10

  [28] Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. Advances in neural information processing systems 35 (2022), 27730–27744

  [29] Ruchir Puri, David S Kung, Geert Janssen, Wei Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen, Mihir Choudhury, Lindsey Decker, et al. Codenet: A large-scale ai for code dataset for learning a diversity of coding tasks. arXiv preprint arXiv:2105.12655 (2021)

  [30] Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, and Shuai Ma. 2020. CodeBLEU: a Method for Automatic Evaluation of Code Synthesis. arXiv:2009.10297 [cs.SE]

  [31] CodeGemma Team, Heri Zhao, Jeffrey Hui, Joshua Howland, Nam Nguyen, Siqi Zuo, Andrea Hu, Christopher A Choquette-Choo, Jingyue Shen, Joe Kelley, et al. 2024. Codegemma: Open code models based on gemma. arXiv preprint arXiv:2406.11409 (2024)

  [32] Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, et al. 2024. Debugbench: Evaluating debugging capability of large language models. arXiv preprint arXiv:2401.04621 (2024)

  [33] Anton Tkachenko, Dmitrij Suskevic, and Benjamin Adolphi. 2025. Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models assembly code deobfuscation capabilities. arXiv preprint arXiv:2505.19887 (2025)

  [34] Tree-sitter Contributors. [n. d.]. Tree-sitter. https://tree-sitter.github.io/tree-sitter/. Accessed: 2025-01-10

  [35] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in neural information processing systems 30 (2017)

  [36] Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, and Sriram Rajamani. 2024. Core: Resolving code quality issues using llms. Proceedings of the ACM on Software Engineering 1, FSE (2024), 789–811

  [37] Alperen Yildiz, Sin G Teo, Yiling Lou, Yebo Feng, Chong Wang, and Dinil M Divakaran. 2025. Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories. arXiv preprint arXiv:2503.03586 (2025)

  [38] JD Zamfirescu-Pereira, Eunice Jun, Michael Terry, Qian Yang, and Björn Hartmann. 2025. Beyond code generation: Llm-supported exploration of the program design space. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–17

  [39] Boyu Zhang, Tianyu Du, Junkai Tong, Xuhong Zhang, Kingsum Chow, Sheng Cheng, Xun Wang, and Jianwei Yin. 2024. SecCoder: Towards Generalizable and Robust Secure Code Generation. arXiv preprint arXiv:2410.01488 (2024)

  [40] Huangzhao Zhang, Kechi Zhang, Zhuo Li, Jia Li, Jia Li, Yongmin Li, Yunfei Zhao, Yuqi Zhu, Fang Liu, Ge Li, et al. 2024. Deep learning for code generation: a survey. Science China Information Sciences 67, 9 (2024), 191101

  [41] Yichi Zhang. 2024. Detecting code comment inconsistencies using llm and program analysis. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 683–685