ConfusedPilot: Confused Deputy Risks in RAG-based LLMs

Ayush RoyChowdhury, Mulong Luo, Prateek Sahu, Sarbartha Banerjee, Mohit Tiwari · 2024 · arXiv 2408.04870

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning

cs.CR · 2026-05-10 · unverdicted · novelty 8.0

Oracle Poisoning corrupts knowledge graphs used by AI agents via tool calls, leading tested models to accept fabricated claims at 100% under directed queries in a production-scale demonstration.

Do Coding Agents Understand Least-Privilege Authorization?

cs.CR · 2026-05-14 · unverdicted · novelty 7.0

Coding agents struggle to infer least-privilege file permissions: they omit needed accesses while granting unused or sensitive ones. Sufficiency-Tightness Decomposition improves sensitive-task success by up to 15.8% and reduces attacks.

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

cs.CR · 2026-05-12 · unverdicted · novelty 7.0

SkillSafetyBench is a benchmark of 155 cases across 47 tasks and 6 risk domains showing that non-user attacks via skills, artifacts, or environments can consistently induce unsafe agent behavior.

Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills

cs.CR · 2026-05-10 · unverdicted · novelty 7.0

Malicious Skills induce coding agents to hallucinate and import attacker-controlled packages at high rates while evading detection.

RAGShield: Detecting Numerical Claim Manipulation in Government RAG Systems

cs.CR · 2026-04-01 · conditional · novelty 6.0

RAGShield detects all numerical manipulations in government RAG systems via pattern-based value extraction and cross-source verification, achieving 0% attack success rate on 430 real IRS-derived attacks where embedding defenses miss 79-90%.

Security Considerations for Artificial Intelligence Agents

cs.LG · 2026-03-12 · unverdicted · novelty 3.0

Frontier AI agents introduce new confidentiality, integrity, and availability risks through changed assumptions on code-data separation and authority boundaries, requiring layered defenses like sandboxing and policy enforcement.

citing papers explorer

Showing 6 of 6 citing papers.

Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning · cs.CR · 2026-05-10 · unverdicted · none · ref 21
Oracle Poisoning corrupts knowledge graphs used by AI agents via tool calls, leading tested models to accept fabricated claims at 100% under directed queries in a production-scale demonstration.
Do Coding Agents Understand Least-Privilege Authorization? · cs.CR · 2026-05-14 · unverdicted · none · ref 7
Coding agents struggle to infer least-privilege file permissions: they omit needed accesses while granting unused or sensitive ones. Sufficiency-Tightness Decomposition improves sensitive-task success by up to 15.8% and reduces attacks.
SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces · cs.CR · 2026-05-12 · unverdicted · none · ref 75
SkillSafetyBench is a benchmark of 155 cases across 47 tasks and 6 risk domains showing that non-user attacks via skills, artifacts, or environments can consistently induce unsafe agent behavior.
Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills · cs.CR · 2026-05-10 · unverdicted · none · ref 29
Malicious Skills induce coding agents to hallucinate and import attacker-controlled packages at high rates while evading detection.
RAGShield: Detecting Numerical Claim Manipulation in Government RAG Systems · cs.CR · 2026-04-01 · conditional · none · ref 12
RAGShield detects all numerical manipulations in government RAG systems via pattern-based value extraction and cross-source verification, achieving 0% attack success rate on 430 real IRS-derived attacks where embedding defenses miss 79-90%.
Security Considerations for Artificial Intelligence Agents · cs.LG · 2026-03-12 · unverdicted · none · ref 46
Frontier AI agents introduce new confidentiality, integrity, and availability risks through changed assumptions on code-data separation and authority boundaries, requiring layered defenses like sandboxing and policy enforcement.
