arXiv preprint arXiv:2506.15253 · 2025

5 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents

cs.CY · 2026-04-11 · accept · novelty 8.0

This paper delivers the first systematic taxonomy and cross-benchmark consistency analysis of 40 agent safety benchmarks, finding broad but shallow risk coverage, no ranking concordance across evaluations, and that benchmark choice systematically alters reported safety.

Do Coding Agents Understand Least-Privilege Authorization?

cs.CR · 2026-05-14 · unverdicted · novelty 7.0

Coding agents struggle to infer least-privilege file permissions: they omit needed accesses while granting unused or sensitive ones. Sufficiency-Tightness Decomposition improves sensitive-task success by up to 15.8% and reduces attacks.

Evaluating Privilege Usage of Agents with Real-World Tools

cs.CR · 2026-03-30 · unverdicted · novelty 6.0

GrantBox evaluates LLM agents using real-world tools and finds they remain vulnerable to sophisticated prompt injection attacks with an 84.80% average success rate.

ADR: An Agentic Detection System for Enterprise Agentic AI Security

cs.AI · 2026-05-17 · unverdicted · novelty 5.0

ADR is a three-component detection system for AI agents that combines telemetry sensors, red teaming, and two-tier detection, achieving 97.2% precision in a ten-month Uber deployment and outperforming baselines on the new ADR-Bench.

Content-Aware Attack Detection in LLM Agent Tool-Call Traffic: An Empirical Study of Features, Architectures, and Evaluation Protocols

cs.CR · 2026-05-11 · unverdicted · novelty 5.0 · 3 refs

Content embeddings from SBERT enable AUROC above 0.89 for attack detection in MCP tool-call sessions, with tree ensembles on pooled embeddings reaching 0.975 and outperforming GNNs when using task-stratified splits instead of random ones.

citing papers explorer

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents
cs.CY · 2026-04-11 · accept · none · ref 14

Do Coding Agents Understand Least-Privilege Authorization?
cs.CR · 2026-05-14 · unverdicted · none · ref 49

Evaluating Privilege Usage of Agents with Real-World Tools
cs.CR · 2026-03-30 · unverdicted · none · ref 8

ADR: An Agentic Detection System for Enterprise Agentic AI Security
cs.AI · 2026-05-17 · unverdicted · none · ref 8

Content-Aware Attack Detection in LLM Agent Tool-Call Traffic: An Empirical Study of Features, Architectures, and Evaluation Protocols
cs.CR · 2026-05-11 · unverdicted · none · ref 18 · 3 links
