Formal Policy Enforcement for Real-World Agentic Systems

· 2026 · cs.CR · arXiv 2602.16708

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

open full Pith review browse 9 citing papers arXiv PDF

abstract

Security policy enforcement in contemporary agentic systems predominantly consists of embedding natural-language policies within an agent's system prompt and delegating compliance to the agent's reasoning. This approach admits no formal enforcement guarantee and cannot express policies whose satisfaction depends on the causal history of an execution, a gap that becomes acute in multi-agent systems, where enforcement must reason across agents. We argue that policy enforcement in agentic systems is most naturally understood as a cross-cutting concern, and propose a framework grounded in aspect-oriented programming that specifies policies independent of the agent's reasoning and enforces them at every policy-relevant decision. Policies are written in Datalog over a set of abstract predicates describing the execution context, an observability service governed by a formal assume/guarantee contract maintains these predicates, and a reference monitor consults the policy at each action to produce an enforcement decision. When the environment contract holds, enforcement decisions coincide with the policy's intended semantics. We adopt Datalog as the policy language, a natural fit because it supports declarative rule specification, admits recursion for policies over transitive relationships, and yields deterministic enforcement. Datalog further admits tractable static analyses for contradiction, redundancy, subsumption, and conditional reachability, enabling authors to verify policy intent and surface ambiguities inherent in natural-language specifications. We realize the framework in FORGE, which enforces policies over agentic deployments without modification to the underlying agents. We evaluate FORGE on three case studies: information flow policies for prompt injection defense, approval workflows in a multi-agent pharmacovigilance system, and organizational policies for customer service.

representative citing papers

Anumati: Proof of Adherence as a Formal Consent Model for Autonomous Agent Protocols

cs.CR · 2026-04-16 · unverdicted · novelty 7.0

Anumati defines proof of adherence via versioned PolicyDocument, ConsentRecord, and AdherenceEvent primitives as a non-breaking extension to A2A and MCP protocols.

Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents

cs.CR · 2026-04-05 · unverdicted · novelty 7.0

The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.

SOCpilot: Verifying Policy Compliance for LLM-Assisted Incident Response

cs.CR · 2026-05-06 · unverdicted · novelty 6.0

SOCpilot supplies a fixed verifier and public artifact that removes 466 non-compliant approval-gated actions from LLM plans on 200 real incidents while preserving task recall.

An AI Agent Execution Environment to Safeguard User Data

cs.CR · 2026-04-21 · unverdicted · novelty 6.0

GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.

Owner-Harm: A Missing Threat Model for AI Agent Safety

cs.CR · 2026-04-20 · unverdicted · novelty 6.0

Owner-Harm is a new threat model with eight categories of agent behavior that harms the deployer, and existing defenses achieve only 14.8% true positive rate on injection-based owner-harm tasks versus 100% on generic criminal harm.

Engineering Robustness into Personal Agents with the AI Workflow Store

cs.CR · 2026-05-11 · unverdicted · novelty 5.0 · 2 refs

AI agents should shift from on-the-fly plan synthesis to invoking pre-engineered, tested, and reusable workflows stored in an AI Workflow Store to gain reliability and security.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

cs.CR · 2026-05-07 · unverdicted · novelty 5.0

A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.

Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure

cs.AI · 2026-05-06 · unverdicted · novelty 5.0

Multi-agent AI creates an authorization propagation problem not solved by prompt injection defenses or classical access control, requiring identity governance as continuously enforced infrastructure.

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

cs.SE · 2026-04-16 · unverdicted · novelty 5.0

Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.

citing papers explorer

Showing 9 of 9 citing papers.

Anumati: Proof of Adherence as a Formal Consent Model for Autonomous Agent Protocols cs.CR · 2026-04-16 · unverdicted · none · ref 17 · internal anchor
Anumati defines proof of adherence via versioned PolicyDocument, ConsentRecord, and AdherenceEvent primitives as a non-breaking extension to A2A and MCP protocols.
Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents cs.CR · 2026-04-05 · unverdicted · none · ref 21 · internal anchor
The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
SOCpilot: Verifying Policy Compliance for LLM-Assisted Incident Response cs.CR · 2026-05-06 · unverdicted · none · ref 36 · internal anchor
SOCpilot supplies a fixed verifier and public artifact that removes 466 non-compliant approval-gated actions from LLM plans on 200 real incidents while preserving task recall.
An AI Agent Execution Environment to Safeguard User Data cs.CR · 2026-04-21 · unverdicted · none · ref 53 · internal anchor
GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.
Owner-Harm: A Missing Threat Model for AI Agent Safety cs.CR · 2026-04-20 · unverdicted · none · ref 10 · internal anchor
Owner-Harm is a new threat model with eight categories of agent behavior that harms the deployer, and existing defenses achieve only 14.8% true positive rate on injection-based owner-harm tasks versus 100% on generic criminal harm.
Engineering Robustness into Personal Agents with the AI Workflow Store cs.CR · 2026-05-11 · unverdicted · none · ref 40 · 2 links · internal anchor
AI agents should shift from on-the-fly plan synthesis to invoking pre-engineered, tested, and reusable workflows stored in an AI Workflow Store to gain reliability and security.
Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation cs.CR · 2026-05-07 · unverdicted · none · ref 35 · internal anchor
A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.
Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure cs.AI · 2026-05-06 · unverdicted · none · ref 16 · internal anchor
Multi-agent AI creates an authorization propagation problem not solved by prompt injection defenses or classical access control, requiring identity governance as continuously enforced infrastructure.
Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility cs.SE · 2026-04-16 · unverdicted · none · ref 53 · internal anchor
Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.

Formal Policy Enforcement for Real-World Agentic Systems

fields

years

verdicts

representative citing papers

citing papers explorer