Introduces a cooperative Recuse Signal for LLM agents and reports 100% recusal in a pilot when the signal is present versus 100% task completion without it.
The confused deputy: (or why capabilities might have been invented). SIGOPS Oper
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
Representative citing papers
New benchmark Scammer4U finds 54-93% critical PII leakage from frontier web agents on scam sites versus 0% on benign twins, plus a 30-point gap between verbalized suspicion and actual submission.
AuthGraph aligns an execution provenance graph with a clean authorization graph to detect parameter-source deviations from user intent, reducing attack success rates to 1-2% on AgentDojo and AgentDyn while retaining most task utility.
Introduces the CER framework to reconstruct AI-mediated losses for insurance claim support by assessing control boundaries, evidence availability, and coverage.
Standard observables fail to support delegation-scoped attribution in agentic AI systems, requiring a new gateway and common information model to bind context at execution time.
citing papers explorer
- From Control Boundary to Insurance Claim: Reconstructing AI-Mediated Losses Through the CER Framework