Acoda: Adversarial Code Obfuscation for Defending against LLM-based Analysis
Pith reviewed 2026-06-27 09:24 UTC · model grok-4.3
The pith
Acoda uses a genetic algorithm to generate semantics-preserving obfuscations that make LLMs refuse or misinterpret code analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Acoda is a genetic algorithm-based adversarial code obfuscation framework that leverages LLMs' safety alignment and token-based information processing to generate semantics-preserving obfuscated code. By iteratively optimizing obfuscation strategies, it produces adversarial samples that maximize the rate at which target LLMs refuse or misinterpret code analysis tasks. On seven state-of-the-art LLMs, this yields an attack success rate of up to 70 percent with cross-model transferability and negligible runtime cost.
What carries the argument
Genetic algorithm that iteratively selects and combines eight semantics-preserving obfuscation methods designed to exploit LLMs' safety alignment and token-level processing.
If this is right
- Target LLMs refuse to analyze the obfuscated code or produce incorrect interpretations of its logic.
- The transformed code executes with identical observable behavior to the original source.
- Obfuscation strategies transfer effectively from one LLM to others.
- The generation process adds only minimal extra computation time.
- A new defense option exists for protecting code against automated LLM-driven reverse engineering.
Where Pith is reading between the lines
- Similar search-based obfuscation could be adapted to shield code from other semantic AI tools beyond LLMs.
- An ongoing competition may develop between improved obfuscators and stronger LLM analyzers.
- The technique offers a practical way for developers to limit exposure when sharing code with AI-assisted services.
- Layering this method with conventional obfuscators might raise the bar for any automated code inspection.
Load-bearing premise
An auxiliary LLM plus four response metrics can accurately and without bias determine when a target LLM has refused or misinterpreted the obfuscated code.
What would settle it
Run the original program and its Acoda-obfuscated version on the same set of inputs and check whether every output matches exactly.
Figures
read the original abstract
With the widespread adoption of Large Language Models (LLMs) in software engineering (SE) tasks such as code understanding, debugging, and vulnerability detection, their powerful semantic reasoning ability has also introduced new security and privacy risks. LLMs can analyze, reconstruct, or even reverse-engineer source code logic, potentially leading to the leakage of intellectual property. To address this issue, we propose Acoda, a genetic algorithm-based adversarial code obfuscation framework that defends against LLM-based code analysis. Acoda leverages two key mechanisms of LLMs, namely safety alignment and token-based information processing, to design 8 semantics-preserving obfuscation methods. It iteratively optimizes obfuscation strategies through a genetic algorithm to generate adversarial samples that maximize defensive effectiveness. In addition, we propose a quantitative evaluation framework based on LLM responses, which combines an auxiliary LLM and four evaluation metrics to assess how target LLMs analyze obfuscated code comprehensively. Experimental results show that Acoda can effectively induce LLMs to refuse or misinterpret code analysis. On 7 state-of-the-art LLMs, including GPT-4o, DeepSeek, Qwen, Llama, and Gemma, Acoda achieves an attack success rate (ASR) of up to 70%, with strong cross-model transferability and minimal runtime overhead, while ensuring that the semantics of the original code remain unchanged. Overall, this study provides a new perspective for code protection and LLM security defense in the era of LLMs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Acoda, a genetic algorithm-based framework for generating adversarial code obfuscations. It designs eight semantics-preserving obfuscation methods that exploit LLM safety alignment and token processing, then uses a GA to optimize combinations that maximize defensive effectiveness against LLM code analysis. A quantitative evaluation framework employing an auxiliary LLM and four metrics is introduced to measure attack success rate (ASR), with experiments on seven LLMs (GPT-4o, DeepSeek, Qwen, Llama, Gemma) reporting up to 70% ASR, cross-model transferability, minimal overhead, and unchanged semantics.
Significance. If the central claims hold under rigorous validation, the work provides a concrete, optimization-driven defense for protecting source code IP against LLM-based reverse engineering in SE tasks. The combination of GA search with multi-metric LLM-as-judge evaluation and explicit cross-model testing represents a practical contribution to adversarial robustness in code analysis.
major comments (2)
- [Abstract / Evaluation Framework] Abstract and Evaluation Framework: The reported ASR of up to 70% and cross-model transferability rest entirely on judgments from an auxiliary LLM using four unspecified metrics for refusal/misinterpretation. No human inter-rater agreement, ablation on auxiliary-model choice, or checks for shared training data/alignment bias with target models are described, making the quantitative results vulnerable to circularity and preventing independent verification of the central claim.
- [Methods] Methods: Concrete definitions of the eight obfuscation methods, the GA fitness function (including how semantic preservation and defensive effectiveness are quantified), baseline comparisons, and experimental controls (dataset size, number of generations, mutation/crossover rates) are not supplied in sufficient detail to reproduce or assess the optimization process and the 70% ASR result.
minor comments (1)
- [Abstract] The abstract states that semantics remain unchanged but provides no explicit metric or verification procedure (e.g., test-suite equivalence or diff-based checks) used to enforce this constraint during GA evolution.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We will revise the manuscript to address the concerns about evaluation rigor and methodological detail, thereby improving reproducibility and mitigating risks of bias in the reported results.
read point-by-point responses
-
Referee: [Abstract / Evaluation Framework] Abstract and Evaluation Framework: The reported ASR of up to 70% and cross-model transferability rest entirely on judgments from an auxiliary LLM using four unspecified metrics for refusal/misinterpretation. No human inter-rater agreement, ablation on auxiliary-model choice, or checks for shared training data/alignment bias with target models are described, making the quantitative results vulnerable to circularity and preventing independent verification of the central claim.
Authors: We agree this is a substantive limitation in the current presentation. In the revised version we will explicitly name and define the four metrics, report human inter-rater agreement on a sampled subset of judgments, add an ablation varying the auxiliary LLM, and include a discussion of steps taken to reduce shared-data/alignment bias (e.g., use of models from distinct providers and families). These changes will strengthen the central claim. revision: yes
-
Referee: [Methods] Methods: Concrete definitions of the eight obfuscation methods, the GA fitness function (including how semantic preservation and defensive effectiveness are quantified), baseline comparisons, and experimental controls (dataset size, number of generations, mutation/crossover rates) are not supplied in sufficient detail to reproduce or assess the optimization process and the 70% ASR result.
Authors: We concur that the Methods section currently lacks the granularity needed for reproduction. The revision will supply: explicit definitions and pseudocode for each of the eight obfuscation transformations; the exact GA fitness function with formulas quantifying semantic preservation (via automated test-suite equivalence) and defensive effectiveness; comparisons to standard obfuscation baselines; and all experimental hyperparameters (dataset sizes, generations, population size, mutation/crossover rates, and selection criteria). revision: yes
Circularity Check
No circularity in derivation chain; empirical evaluation framework is self-contained
full rationale
The paper describes an empirical genetic-algorithm obfuscation method and an auxiliary-LLM evaluation protocol with four metrics, but contains no equations, fitted parameters presented as predictions, self-citation load-bearing uniqueness theorems, or ansatzes smuggled via prior work. All reported ASR, transferability, and semantic-preservation numbers are experimental outcomes on external target models rather than quantities derived by construction from the paper's own definitions or inputs. The evaluation framework is an experimental design choice whose validity is open to external critique, yet it does not reduce the central claims to a self-referential loop.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The eight obfuscation methods preserve original code semantics
- domain assumption LLM responses can be reliably scored for refusal or misinterpretation using an auxiliary LLM and four metrics
Reference graph
Works this paper leans on
-
[1]
Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, et al. 2022. Constitutional ai: Harmlessness from ai feedback.arXiv preprint arXiv:2212.08073(2022)
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[2]
David Beste, Grégoire Menguy, Hossein Hajipour, Mario Fritz, Antonio Emanuele Cinà, Sébastien Bardin, Thorsten Holz, Thorsten Eisenhofer, and Lea Schönherr
-
[3]
InInternational Conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Exploring the Potential of LLMs for Code Deobfuscation. InInternational Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 267–286
- [4]
-
[5]
Zimin Chen, Sen Fang, and Martin Monperrus. 2024. Supersonic: Learning to generate source code optimizations in C/C++.IEEE Transactions on Software Engineering(2024)
2024
-
[6]
Byunggeon Choi, Hongjoo Jin, Dong Hoon Lee, and Wonsuk Choi. 2024. Chat- DEOB: An Effective Deobfuscation Method Based on Large Language Model. In International Conference on Information Security Applications. Springer, 151–163
2024
-
[7]
Christian Collberg. [n. d.]. Tigress: Transformations for C Programs. https: //tigress.wtf/. Accessed: 2025-01-10
2025
- [8]
-
[9]
Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, and Shu- vendu K Lahiri. 2024. Llm-based test-driven interactive code generation: User study and empirical evaluation.IEEE Transactions on Software Engineering(2024)
2024
-
[10]
Guardsquare. [n. d.]. ProGuard. https://www .guardsquare.com/proguard. Ac- cessed: 2025-01-10
2025
-
[11]
Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, et al. 2025. Deepseek-r1 incen- tivizes reasoning in llms through reinforcement learning.Nature645, 8081 (2025), 633–638
2025
-
[12]
Yuejun Guo, Constantinos Patsakis, Qiang Hu, Qiang Tang, and Fran Casino. 2024. Outside the comfort zone: Analysing llm capabilities in software vulnerability detection. InEuropean symposium on research in computer security. Springer, 271–289
2024
- [13]
-
[14]
Jingxuan He and Martin Vechev. 2023. Large language models for code: Secu- rity hardening and adversarial testing. InProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 1865–1879
2023
- [15]
-
[16]
Peiwei Hu, Ruigang Liang, and Kai Chen. 2024. Degpt: Optimizing decompiler output with llm. InProceedings 2024 Network and Distributed System Security Symposium, Vol. 267622140
2024
-
[17]
JavaScript Obfuscator Contributors. [n. d.]. JavaScript Obfuscator. https: //github.com/javascript-obfuscator/javascript-obfuscator. Accessed: 2025-01-10
2025
- [18]
-
[19]
Sathvik Joel, Jie Wu, and Fatemeh Fard. 2024. A survey on llm-based code generation for low-resource and domain-specific programming languages.ACM Transactions on Software Engineering and Methodology(2024)
2024
-
[20]
Marie-Anne Lachaux, Baptiste Roziere, Marc Szafraniec, and Guillaume Lample
-
[21]
Advances in Neural Information Processing Systems34 (2021), 14967–14979
DOBF: A deobfuscation pre-training objective for programming languages. Advances in Neural Information Processing Systems34 (2021), 14967–14979
2021
-
[22]
Dong Li, Meng Yan, Yaosheng Zhang, Zhongxin Liu, Chao Liu, Xiaohong Zhang, Ting Chen, and David Lo. 2024. CoSec: On-the-Fly security hardening of code LLMs via supervised co-decoding. InProceedings of the 33rd ACM SIGSOFT Inter- national Symposium on Software Testing and Analysis. 1428–1439
2024
- [23]
-
[24]
Guilong Lu, Xiaolin Ju, Xiang Chen, Wenlong Pei, and Zhilong Cai. 2024. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning.Journal of Systems and Software212 (2024), 112031
2024
-
[25]
Meta Platforms, Inc. [n. d.]. LLaMA Responsible Use Guide. https:// www.llama.com/docs/how-to-guides/responsible-use-guide-resources. Ac- cessed: 2025-01-10
2025
-
[26]
Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, and Brad Myers. 2024. Using an llm to help with code understanding. InProceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–13
2024
-
[27]
Mahmoud Nazzal, Issa Khalil, Abdallah Khreishah, and NhatHai Phan. 2024. Promsec: Prompt optimization for secure generation of functional source code with large language models (llms). InProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. 2266–2280
2024
-
[28]
Yu Nong, Richard Fang, Guangbei Yi, Kunsong Zhao, Xiapu Luo, Feng Chen, and Haipeng Cai. 2024. Vgx: Large-scale sample generation for boosting learning- based software vulnerability analyses. InProceedings of the IEEE/ACM 46th Inter- national Conference on Software Engineering. 1–13
2024
-
[29]
Obfuscator-LLVM Contributors. [n. d.]. Obfuscator-LLVM. https://github .com/ obfuscator-llvm/obfuscator. Accessed: 2025-01-10
2025
-
[30]
Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback.Advances in neural information processing systems35 (2022), 27730–27744
2022
-
[31]
Ruchir Puri, David S Kung, Geert Janssen, Wei Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen, Mihir Choudhury, Lindsey Decker, et al
- [32]
-
[33]
Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundare- san, Ming Zhou, Ambrosio Blanco, and Shuai Ma. 2020. CodeBLEU: a Method for Automatic Evaluation of Code Synthesis. arXiv:2009.10297 [cs.SE]
work page internal anchor Pith review Pith/arXiv arXiv 2020
- [34]
- [35]
- [36]
-
[37]
Tree-sitter Contributors. [n. d.]. Tree-sitter. https://tree-sitter .github.io/tree- sitter/. Accessed: 2025-01-10
2025
-
[38]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need.Advances in neural information processing systems30 (2017)
2017
-
[39]
Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, and Sriram Rajamani. 2024. Core: Resolving code quality issues using llms.Proceedings of the ACM on Software Engineering1, FSE (2024), 789–811
2024
- [40]
-
[41]
JD Zamfirescu-Pereira, Eunice Jun, Michael Terry, Qian Yang, and Björn Hart- mann. 2025. Beyond code generation: Llm-supported exploration of the program design space. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–17
2025
- [42]
-
[43]
Huangzhao Zhang, Kechi Zhang, Zhuo Li, Jia Li, Jia Li, Yongmin Li, Yunfei Zhao, Yuqi Zhu, Fang Liu, Ge Li, et al. 2024. Deep learning for code generation: a survey. Science China Information Sciences67, 9 (2024), 191101
2024
-
[44]
Yichi Zhang. 2024. Detecting code comment inconsistencies using llm and pro- gram analysis. InCompanion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 683–685
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.