Persona Attack uses step-by-step memory injections to make LLMs ignore their safety alignment, reaching success rates of up to 95%, with effectiveness depending on model memory and instruction combinations.
“Local” exposes the agent’s own past reasoning; “Shared
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
Multi-agent AI systems are more vulnerable to attacks than single agents in most tested designs, with attack success rates varying up to 3.8 times depending on how roles, communication, and memory are structured.
Composing Verifiable Conceptual Models via Building Blocks: Towards Design-Time Verification of Agentic AI Workflows
Agentic AI workflows can be verified at design time by composing them from reusable building blocks and checking compatibility via twelve structural rules, with reliable detection shown on flawed and transformed workflow datasets.