STRIDE-AI: A Threat Modeling Framework for Generative AI Security Assessment
Pith reviewed 2026-05-20 14:10 UTC · model grok-4.3
The pith
STRIDE-AI adapts classical threat modeling to generative AI and cuts attack success rates from 80 percent to 15 percent in a case study.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that STRIDE-AI supplies a six-phase assessment lifecycle and an AI-adapted version of the classical STRIDE threat categories, which together connect high-level risk standards to technical vulnerability taxonomies. The framework is implemented through a purpose-built web tool. In a black-box assessment of a deployed LLM chatbot, applying the method reduced the attack success rate from 80 percent to 15 percent inside the sandbox environment.
What carries the argument
The six-phase assessment lifecycle together with the AI-specific adaptation of the STRIDE threat categories, made usable by a dedicated web tool.
If this is right
- Organizations gain a repeatable process to map broad risk standards directly onto specific AI attack surfaces such as prompt injection and data poisoning.
- The web tool allows security teams to conduct consistent evaluations at multiple stages of AI system development and deployment.
- The adapted STRIDE categories make it possible to identify threats that arise from the probabilistic rather than deterministic character of generative models.
- The reported reduction in attack success rate indicates that structured threat modeling can materially lower exposure when applied to deployed LLM chatbots.
- The framework supplies a concrete way to move from abstract guidelines to actionable technical controls for generative AI.
Where Pith is reading between the lines
- If the six-phase lifecycle works reliably on additional systems, it could become a practical template for standardizing security reviews across the generative AI industry.
- The same adaptation of STRIDE categories might apply to other probabilistic models beyond large language models, such as diffusion or reinforcement-learning agents.
- Embedding the web tool into existing development pipelines would let teams catch AI-specific vulnerabilities earlier in the release cycle.
- Testing the framework in production environments rather than sandboxes would reveal whether the observed reduction persists under real usage and evolving attack techniques.
Load-bearing premise
A single black-box assessment of one deployed LLM chatbot inside a sandbox environment is sufficient to demonstrate that the six-phase lifecycle and adapted STRIDE method produce the reported reduction and can be generalized to other generative AI systems.
What would settle it
Repeating the black-box assessment with a different generative AI system or outside the sandbox setting and finding that the attack success rate does not drop to a comparable level would show the central claim does not hold generally.
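Whether a replication lands at "a comparable level" depends on how many attack attempts stand behind each percentage, and the abstract does not report the corpus size. The Python sketch below is an illustration rather than anything taken from the paper. It computes a Wilson score interval for an attack success rate, using hypothetical corpus sizes (20, 100, 500) and a hypothetical helper name `wilson_interval`, to show how the plausible range around 80 percent and 15 percent narrows as the number of attempts grows.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# ASSUMPTION: these corpus sizes are hypothetical; the source reports none.
for n in (20, 100, 500):
    before = wilson_interval(round(0.80 * n), n)
    after = wilson_interval(round(0.15 * n), n)
    print(f"n={n:>3}  before: {before[0]:.2f}-{before[1]:.2f}  "
          f"after: {after[0]:.2f}-{after[1]:.2f}")
```

With a small corpus the interval around 15 percent is wide, so a replication would need a known and reasonably large number of attempts before a different result could be read as a failure to replicate.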
Original abstract
Traditional cybersecurity methodologies target deterministic systems and fail to address the probabilistic nature of AI, leaving systems vulnerable to attack vectors such as model inversion, data poisoning, and prompt injection. Recent industry reports indicate that a majority of organizations deploying AI lack a dedicated security strategy, with adversarial attacks increasing rapidly year-over-year. We present STRIDE-AI, a framework that bridges the gap between high-level risk standards (NIST AI RMF) and technical vulnerability taxonomies (OWASP LLM Top 10). The framework defines a six-phase assessment lifecycle, introduces a threat modeling adaptation of classical STRIDE for AI systems, and is operationalized through a purpose-built web tool. We provide an initial validation of the approach through a black-box assessment of a deployed LLM chatbot, which successfully reduced the attack success rate from 80% to 15% in our sandbox case study.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes STRIDE-AI, a threat modeling framework for generative AI security assessment. It bridges high-level risk standards such as the NIST AI RMF with technical vulnerability taxonomies like the OWASP LLM Top 10 by defining a six-phase assessment lifecycle, adapting the classical STRIDE model to AI-specific threats (e.g., prompt injection, model inversion), and operationalizing the approach via a purpose-built web tool. An initial validation is presented through a black-box assessment of a deployed LLM chatbot in a sandbox environment, reporting a reduction in attack success rate from 80% to 15%.
Significance. If the reported reduction can be replicated with fuller experimental controls and extended to additional systems, the framework would provide a concrete, actionable bridge between abstract risk management guidelines and low-level attack vectors, addressing a documented gap in organizational AI security practices. The inclusion of a web tool adds practical value for adoption, and the adaptation of STRIDE represents a reasonable extension of established methods to probabilistic AI systems.
major comments (2)
- [Case Study / Validation] The validation reports an attack success rate reduction from 80% to 15% in the sandbox case study, but supplies no details on the attack corpus size, the distribution of prompts across threat categories, the precise baseline configuration prior to framework application, or any statistical tests or controls. This information is required to isolate the contribution of the six-phase lifecycle and adapted STRIDE from possible confounding factors and to support claims of generalizability to other generative AI systems.
- [Framework Description] The six-phase assessment lifecycle is presented at a high level without an explicit mapping or example showing how the adapted STRIDE categories are applied within each phase or how they integrate with NIST AI RMF and OWASP LLM Top 10 elements. A concrete workflow diagram or worked example would be needed to make the framework replicable.
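For the first major comment, one form the requested statistical test could take is Fisher's exact test on the before and after attempt counts. The sketch below is not the authors' analysis. The counts (16 of 20 before, 3 of 20 after) are hypothetical values chosen only to match the reported 80 and 15 percent, and `fisher_one_sided` is an illustrative helper name.

```python
from math import comb


def fisher_one_sided(succ_before: int, n_before: int,
                     succ_after: int, n_after: int) -> float:
    """One-sided Fisher's exact test.

    Returns the probability, under the null hypothesis of equal success rates,
    of seeing at least `succ_before` successful attacks in the "before" group
    given the fixed row and column totals (hypergeometric upper tail).
    """
    total = n_before + n_after
    total_succ = succ_before + succ_after
    denom = comb(total, n_before)
    upper = min(n_before, total_succ)
    tail = 0
    for k in range(succ_before, upper + 1):
        # k successes land in the "before" group, the rest of that group are failures.
        tail += comb(total_succ, k) * comb(total - total_succ, n_before - k)
    return tail / denom


# ASSUMPTION: hypothetical counts; the paper does not report its corpus size.
p_value = fisher_one_sided(succ_before=16, n_before=20, succ_after=3, n_after=20)
print(f"one-sided p-value: {p_value:.2e}")
```

A small p-value would only indicate that the drop is unlikely to be sampling noise. It would not by itself separate the framework's contribution from confounding factors, which still requires the baseline details and controls the comment asks for.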
minor comments (1)
- [Introduction] The introduction cites industry reports on the lack of dedicated AI security strategies but would benefit from specific references to those reports for traceability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving the rigor of the validation and the clarity of the framework description. We address each point below and outline the revisions we will make.
- Referee: The validation reports an attack success rate reduction from 80% to 15% in the sandbox case study, but supplies no details on the attack corpus size, the distribution of prompts across threat categories, the precise baseline configuration prior to framework application, or any statistical tests or controls. This information is required to isolate the contribution of the six-phase lifecycle and adapted STRIDE from possible confounding factors and to support claims of generalizability to other generative AI systems.
  Authors: We agree that the current presentation of the case study is high-level and lacks the requested experimental details. The validation was conducted as an initial proof-of-concept in a controlled sandbox to demonstrate feasibility rather than as a comprehensive controlled experiment. In the revised manuscript we will expand the case study section to report the attack corpus size, the distribution of prompts across the adapted STRIDE-AI categories, the exact baseline configuration, and any statistical measures that were applied. We will also add an explicit discussion of limitations and the preliminary nature of the results to avoid overclaiming generalizability.
  Revision: yes
- Referee: The six-phase assessment lifecycle is presented at a high level without an explicit mapping or example showing how the adapted STRIDE categories are applied within each phase or how they integrate with NIST AI RMF and OWASP LLM Top 10 elements. A concrete workflow diagram or worked example would be needed to make the framework replicable.
  Authors: We accept that an explicit mapping and concrete example are needed for replicability. We will add a workflow diagram that shows how each adapted STRIDE category is instantiated within the six phases and how the phases connect to specific NIST AI RMF functions and OWASP LLM Top 10 items. We will also include a worked example that walks through the application of the framework on a representative generative AI system, making the integration explicit.
  Revision: yes
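The kind of explicit mapping discussed in this exchange could be recorded as a small lookup table with a coverage check. The sketch below is hypothetical. The item names follow the 2025 release of the OWASP Top 10 for LLM Applications, but the assignment of items to STRIDE categories is an illustrative guess and not the paper's adapted STRIDE. The six phases are left out because the source does not name them, and `unmapped_items` is an invented helper.

```python
# Item names from the OWASP Top 10 for LLM Applications, 2025 release.
OWASP_LLM_TOP10 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}

# ASSUMPTION: illustrative assignment only; the paper's own mapping is not
# given in the source and may differ in every row.
STRIDE_TO_OWASP = {
    "Spoofing": ["LLM01"],
    "Tampering": ["LLM03", "LLM04", "LLM08"],
    "Repudiation": [],
    "Information Disclosure": ["LLM02", "LLM07"],
    "Denial of Service": ["LLM10"],
    "Elevation of Privilege": ["LLM05", "LLM06"],
}


def unmapped_items(mapping: dict[str, list[str]], catalogue: dict[str, str]) -> list[str]:
    """Return catalogue IDs that no threat category covers, in sorted order."""
    covered = {item for items in mapping.values() for item in items}
    unknown = covered - set(catalogue)
    if unknown:
        raise ValueError(f"mapping refers to unknown items: {sorted(unknown)}")
    return sorted(set(catalogue) - covered)


for item_id in unmapped_items(STRIDE_TO_OWASP, OWASP_LLM_TOP10):
    print(f"not covered by any category: {item_id} {OWASP_LLM_TOP10[item_id]}")
```

A check like this makes gaps visible: any taxonomy item that no category claims, or any category with an empty list, is a place where the promised worked example would need to say how the framework handles it.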
Circularity Check
No significant circularity in STRIDE-AI framework construction
The paper presents STRIDE-AI as a constructive adaptation of classical STRIDE threat modeling to generative AI, defining a six-phase assessment lifecycle that maps high-level NIST AI RMF standards to technical OWASP LLM Top 10 vulnerabilities. This mapping and the purpose-built web tool are introduced as new organizational constructs rather than derived from fitted parameters, self-referential definitions, or load-bearing self-citations. The single black-box sandbox case study reports an empirical reduction in attack success rate but does not reduce the framework's claims to its own inputs by construction; the result is presented as initial validation of the approach, not a prediction forced by prior fits or author-specific uniqueness theorems. No equations, ansatzes smuggled via citation, or renamings of known results appear in the derivation chain. The framework remains self-contained against external benchmarks such as existing standards and taxonomies.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Traditional cybersecurity methodologies target deterministic systems and fail to address the probabilistic nature of AI.