Formal Policy Enforcement for Real-World Agentic Systems

2026 · cs.CR · arXiv 2602.16708

Canonical reference. 80% of citing Pith papers cite this work as background.

16 Pith papers citing it

Background 80% of classified citations

abstract

Security policy enforcement in contemporary agentic systems predominantly consists of embedding natural-language policies within an agent's system prompt and delegating compliance to the agent's reasoning. This approach admits no formal enforcement guarantee and cannot express policies whose satisfaction depends on the causal history of an execution, a gap that becomes acute in multi-agent systems, where enforcement must reason across agents. We argue that policy enforcement in agentic systems is most naturally understood as a cross-cutting concern, and propose a framework grounded in aspect-oriented programming that specifies policies independent of the agent's reasoning and enforces them at every policy-relevant decision. Policies are written in Datalog over a set of abstract predicates describing the execution context, an observability service governed by a formal assume/guarantee contract maintains these predicates, and a reference monitor consults the policy at each action to produce an enforcement decision. When the environment contract holds, enforcement decisions coincide with the policy's intended semantics. We adopt Datalog as the policy language, a natural fit because it supports declarative rule specification, admits recursion for policies over transitive relationships, and yields deterministic enforcement. Datalog further admits tractable static analyses for contradiction, redundancy, subsumption, and conditional reachability, enabling authors to verify policy intent and surface ambiguities inherent in natural-language specifications. We realize the framework in FORGE, which enforces policies over agentic deployments without modification to the underlying agents. We evaluate FORGE on three case studies: information flow policies for prompt injection defense, approval workflows in a multi-agent pharmacovigilance system, and organizational policies for customer service.
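The abstract's central mechanism, a recursive Datalog rule consulted by a reference monitor at each action, can be illustrated with a minimal Python sketch. This is not FORGE's implementation or API: the predicate names (flows, derived), the example data items, and the decide function are all hypothetical, and the naive fixpoint loop stands in for a real Datalog engine. It assumes an information flow policy of the kind mentioned for prompt injection defense, where an action is denied if its input is transitively derived from an untrusted source.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Naive fixpoint evaluation of the (hypothetical) Datalog rules:

        derived(X, Y) :- flows(X, Y).
        derived(X, Z) :- derived(X, Y), flows(Y, Z).
    """
    derived = set(edges)
    while True:
        # Join derived(X, Y) with flows(Y, Z) to obtain candidate derived(X, Z).
        new = {(x, z) for (x, y) in derived for (y2, z) in edges if y == y2}
        if new <= derived:
            # No new facts: the fixpoint has been reached.
            return derived
        derived |= new


def decide(action_input: str, untrusted: Set[str], flows: Set[Edge]) -> str:
    """Hypothetical reference monitor decision for a single action.

    Denies the action if its input is an untrusted source, or is
    transitively derived from one; allows it otherwise.
    """
    derived = transitive_closure(flows)
    tainted = action_input in untrusted or any(
        (src, action_input) in derived for src in untrusted
    )
    return "deny" if tainted else "allow"


# Assumed execution context: facts an observability service might maintain.
flows = {
    ("web_page", "summary"),
    ("summary", "email_body"),
    ("user_prompt", "calendar_entry"),
}
untrusted = {"web_page"}

print(decide("email_body", untrusted, flows))      # deny
print(decide("calendar_entry", untrusted, flows))  # allow
```

The decision depends only on the recorded facts and the rules, not on the agent's reasoning, which is the sense in which the abstract describes enforcement as deterministic. The recursion over flows is what lets the policy refer to the causal history of an execution rather than to the current action alone.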

citation-role summary

background: 4 · baseline: 1

citation-polarity summary

background: 4 · baseline: 1

representative citing papers

Anumati: Proof of Adherence as a Formal Consent Model for Autonomous Agent Protocols

cs.CR · 2026-04-16 · unverdicted · novelty 7.0

Anumati defines proof of adherence via versioned PolicyDocument, ConsentRecord, and AdherenceEvent primitives as a non-breaking extension to A2A and MCP protocols.

Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents

cs.CR · 2026-04-05 · unverdicted · novelty 7.0

The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.

Janus: a Playground for User-Involved Agentic Permission Management

cs.AI · 2026-07-01 · unverdicted · novelty 6.0

Janus is a publicly available playground system and evaluation harness for testing user-involved permission management designs in AI agents, demonstrating benefits of user input and the need for context-sensitive approaches.

PolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents

cs.AI · 2026-06-28 · unverdicted · novelty 6.0

PolicyGuard is a dialogue-grounded sub-agent verifier that raises PASS4 scores by 6-12 points on an airline benchmark while catching more violations with fewer blocks than argument-level guards.

Prompts Don't Protect: Architectural Enforcement via MCP Proxy for LLM Tool Access Control

cs.CR · 2026-05-18 · unverdicted · novelty 6.0

MCP proxy enforces ABAC for LLM tool access by filtering discovery and invocation, achieving 0% unauthorized invocation rate across tested models and attacks where prompts reduce risk by only 11-18 points.

SOCpilot: Verifying Policy Compliance for LLM-Assisted Incident Response

cs.CR · 2026-05-06 · unverdicted · novelty 6.0

SOCpilot supplies a fixed verifier and public artifact that removes 466 non-compliant approval-gated actions from LLM plans on 200 real incidents while preserving task recall.

An AI Agent Execution Environment to Safeguard User Data

cs.CR · 2026-04-21 · unverdicted · novelty 6.0

GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.

Owner-Harm: A Missing Threat Model for AI Agent Safety

cs.CR · 2026-04-20 · unverdicted · novelty 6.0

Owner-Harm is a new threat model with eight categories of agent behavior that harms the deployer, and existing defenses achieve only 14.8% true positive rate on injection-based owner-harm tasks versus 100% on generic criminal harm.

Security Considerations for Multi-agent Systems

cs.CR · 2026-03-09 · unverdicted · novelty 6.0

No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.

Lingering Authority: Revocable Resource-and-Effect Capabilities for Coding Agents

cs.CR · 2026-06-21 · unverdicted · novelty 5.0

PORTICO is a revocable capability reference monitor for coding agents that enforces task contracts via grant-invoke-closure lifecycles and rejects post-closure reuses while preserving task success.

Overlaying Governance: A Compositional Authorization Framework for Delegation and Scope in Agentic AI

cs.AI · 2026-06-02 · unverdicted · novelty 5.0

Introduces a compositional governance framework defining delegation types, resource scope attenuation, and an overlay operator for agentic AI authorization policies.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

cs.CR · 2026-05-07 · unverdicted · novelty 5.0

A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.

Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure

cs.AI · 2026-05-06 · unverdicted · novelty 5.0

Multi-agent AI creates an authorization propagation problem not solved by prompt injection defenses or classical access control, requiring identity governance as continuously enforced infrastructure.

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

cs.SE · 2026-04-16 · unverdicted · novelty 5.0

Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.

Agent Security is a Systems Problem

cs.CR · 2026-05-18 · unverdicted · novelty 4.0 · 2 refs

The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

Engineering Robustness into Personal Agents with the AI Workflow Store

cs.CR · 2026-05-11 · unverdicted · novelty 4.0 · 3 refs

Position paper advocating a shift from on-the-fly AI agent synthesis to reusable hardened workflows in an AI Workflow Store to improve robustness and security.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents

cs.CR · 2026-04-05 · unverdicted · none · ref 21 · internal anchor

The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.

An AI Agent Execution Environment to Safeguard User Data

cs.CR · 2026-04-21 · unverdicted · none · ref 53 · internal anchor

GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

cs.SE · 2026-04-16 · unverdicted · none · ref 53 · internal anchor

Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.

Engineering Robustness into Personal Agents with the AI Workflow Store

cs.CR · 2026-05-11 · unverdicted · none · ref 40 · 3 links · internal anchor

Position paper advocating a shift from on-the-fly AI agent synthesis to reusable hardened workflows in an AI Workflow Store to improve robustness and security.
