Local” exposes the agent’s own past reasoning; “Shared

“Evolving Security in LLMs: A Study of Jailbreak Attacks, Defenses” · 2025 · arXiv 2504.02080

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

cs.CR · 2026-05-29 · unverdicted · novelty 7.0

Persona Attack uses step-by-step memory injections to make LLMs ignore their safety alignment, reaching up to 95% success, with effectiveness depending on the model's memory and on the combination of instructions used.

Composing Verifiable Conceptual Models via Building Blocks: Towards Design-Time Verification of Agentic AI Workflows

cs.AI · 2026-06-19 · unverdicted · novelty 6.0

Agentic AI workflows can be verified at design time by composing them from reusable building blocks and checking compatibility via twelve structural rules, with reliable detection shown on flawed and transformed workflow datasets.

Architecture Matters for Multi-Agent Security

cs.MA · 2026-04-25 · unverdicted · novelty 6.0

Multi-agent AI systems are more vulnerable to attacks than single agents in most tested designs, with attack success rates varying by up to a factor of 3.8 depending on how roles, communication, and memory are structured.
