ACM SIGOPS Operating Systems Review
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
New benchmark Scammer4U finds 54-93% critical PII leakage from frontier web agents on scam sites versus 0% on benign twins, plus a 30-point gap between verbalized suspicion and actual submission.
Introduces the CER framework to reconstruct AI-mediated losses for insurance claim support by assessing control boundaries, evidence availability, and coverage.
Standard observables fail to support delegation-scoped attribution in agentic AI systems, requiring a new gateway and common information model to bind context at execution time.
citing papers explorer
- Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
  Introduces a cooperative Recuse Signal for LLM agents and reports 100% recusal in a pilot when the signal is present versus 100% task completion without it.
- "I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents
  New benchmark Scammer4U finds 54-93% critical PII leakage from frontier web agents on scam sites versus 0% on benign twins, plus a 30-point gap between verbalized suspicion and actual submission.
- From Control Boundary to Insurance Claim: Reconstructing AI-Mediated Losses Through the CER Framework
  Introduces the CER framework to reconstruct AI-mediated losses for insurance claim support by assessing control boundaries, evidence availability, and coverage.
- Observability for Delegated Execution in Agentic AI Systems
  Standard observables fail to support delegation-scoped attribution in agentic AI systems, requiring a new gateway and common information model to bind context at execution time.