Measuring AI agents’ progress on multi-step cyber attack scenarios
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CR (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
Adversarial restlessness in LLM activations allows five scalar features to detect multi-turn prompt injections at 93.8% accuracy on synthetic data, with cross-model replication but source-dependent generalization to real-world chats.
No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
Expert-defined action plans for LLM agents achieve higher task completion in lateral-movement scenarios than fully autonomous or self-scaffolded modes, but failures remain common due to brittle commands and state handling.
citing papers explorer
- SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization
  SecureForge audits LLM code for vulnerabilities, builds a synthetic prompt corpus via Markovian sampling, and optimizes system prompts to cut security issues by up to 48% while preserving unit test performance, with zero-shot transfer to real prompts.
- Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection
  Adversarial restlessness in LLM activations allows five scalar features to detect multi-turn prompt injections at 93.8% accuracy on synthetic data, with cross-model replication but source-dependent generalization to real-world chats.
- Security Considerations for Multi-agent Systems
  No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
- Autonomous Adversary: Red-Teaming in the age of LLM
  Expert-defined action plans for LLM agents achieve higher task completion in lateral-movement scenarios than fully autonomous or self-scaffolded modes, but failures remain common due to brittle commands and state handling.