arxiv: 2604.22136 · v1 · submitted 2026-04-24 · 💻 cs.CR · cs.LG

Recognition: unknown

Sovereign Agentic Loops: Decoupling AI Reasoning from Execution in Real-World Systems

Deying Yu, Jun He

Pith reviewed 2026-05-08 11:47 UTC · model grok-4.3

keywords: AI safety, LLM agents, intent validation, control plane, policy enforcement, deterministic replay, agentic architecture, cloud infrastructure

The pith

Sovereign Agentic Loops separate LLM intent generation from execution so a control plane can validate actions against policy and system state.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that current LLM agents risk unsafe actions because stochastic outputs reach execution layers without checks. Sovereign Agentic Loops address this by having models emit structured intents with justifications that a separate control plane must approve using true state and policy. An obfuscation membrane hides identity-sensitive details from the model while a cryptographically linked Evidence Chain records everything for audit and replay. Under the paper's assumptions this yields policy-bounded execution, identity isolation, and deterministic replay. The OpenKedge prototype shows 93 percent of unsafe intents stopped at policy validation, the rest caught by consistency checks, zero unsafe executions in benchmarks, and only 12.4 ms added median latency.

Core claim

Sovereign Agentic Loops (SAL) is a control-plane architecture in which models emit structured intents with justifications; the control plane validates those intents against true system state and policy before execution. SAL combines an obfuscation membrane, which limits model access to identity-sensitive state, with a cryptographically linked Evidence Chain for auditability and replay. Under the stated assumptions, SAL provides policy-bounded execution, identity isolation, and deterministic replay.
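
The validate-before-execute step can be illustrated with a minimal Python sketch. Everything named here (`Intent`, `Decision`, `ALLOWED_ACTIONS`, the `delete_volume` rule, and the dictionary standing in for true system state) is a hypothetical assumption made for illustration; the paper's actual intent schema and policy language are not given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    action: str         # e.g. "delete_volume" (hypothetical action name)
    target: str         # resource identifier as seen by the model
    justification: str  # free-text rationale emitted by the model


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical policy: the only actions the control plane will consider.
ALLOWED_ACTIONS = {"stop_instance", "delete_volume"}


def validate(intent: Intent, state: dict) -> Decision:
    """Check an intent against policy first, then against true state."""
    # Policy layer: reject unknown actions and missing justifications.
    if intent.action not in ALLOWED_ACTIONS:
        return Decision(False, "policy: action not permitted")
    if not intent.justification.strip():
        return Decision(False, "policy: missing justification")
    # Consistency layer: compare the intent with the real resource state.
    resource = state.get(intent.target)
    if resource is None:
        return Decision(False, "consistency: target does not exist")
    if intent.action == "delete_volume" and resource.get("attached"):
        return Decision(False, "consistency: volume is still attached")
    return Decision(True, "ok")


# Stand-in for the true system state held by the control plane.
state = {"vol-1": {"attached": True}, "vol-2": {"attached": False}}
```

Calling `validate(Intent("delete_volume", "vol-1", "cleanup"), state)` in this sketch returns a denial from the consistency layer, because the model's view (the volume can be deleted) disagrees with the state the control plane holds. Only an allowed `Decision` would be handed to an executor.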

What carries the argument

Sovereign Agentic Loops (SAL): a control-plane architecture that validates model-generated structured intents against real system state and policy before any execution occurs, supported by an obfuscation membrane and an Evidence Chain.

If this is right

Unsafe intents are blocked at the policy layer in 93 percent of cases in the OpenKedge prototype.
The remaining 7 percent of unsafe intents are rejected by consistency checks before execution.
No unsafe executions occur in the benchmark when the architecture is active.
The system supports deterministic replay of actions via the cryptographically linked Evidence Chain.
Identity isolation is maintained because the obfuscation membrane limits model visibility into sensitive state.
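
The "cryptographically linked" property of the Evidence Chain can be sketched as a hash chain, where each record commits to the hash of its predecessor. This is a minimal illustration under assumptions: the record layout, the `GENESIS` value, and the helper names `append` and `verify` are hypothetical, and the paper's actual record format is not described in this summary. The sketch demonstrates tamper evidence only, not full replay of system state.

```python
import hashlib
import json

# Hypothetical starting value for the first record's "prev" field.
GENESIS = "0" * 64


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add a record that is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    )


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append(chain, {"intent": "stop_instance", "decision": "allow"})
append(chain, {"intent": "delete_volume", "decision": "deny"})
```

In this sketch, `verify(chain)` holds for the untouched chain and fails as soon as any earlier payload is edited, which is the property an auditor would rely on before replaying the recorded decisions.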

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same validation-before-execution pattern could be tested in domains such as robotic control or financial trading where model outputs affect physical or monetary outcomes.
If the control-plane validation logic contains errors, the entire safety guarantee collapses, which suggests the validator itself needs to be verified.
Deterministic replay enabled by the Evidence Chain opens the possibility of post-incident forensic analysis that does not rely on trusting the original model outputs.

Load-bearing premise

The control plane can accurately validate intents against true system state and policy, and the assumptions under which the formalization guarantees the safety properties actually hold in practice.

What would settle it

An experiment in which an unsafe action executes after passing SAL validation, or a recorded execution sequence cannot be deterministically replayed from the Evidence Chain.

Figures

Figures reproduced from arXiv: 2604.22136 by Deying Yu, Jun He.

**Figure 1.** System topology illustrating the sovereignty conflict. A foreign reasoning agent requires …

**Figure 2.** Comparison between traditional agent execution and the Sovereign Agentic Loop (SAL).

Original abstract

Large language model (LLM) agents increasingly issue API calls that mutate real systems, yet many current architectures pass stochastic model outputs directly to execution layers. We argue that this coupling creates a safety risk because model correctness, context awareness, and alignment cannot be assumed at execution time. We introduce Sovereign Agentic Loops (SAL), a control-plane architecture in which models emit structured intents with justifications, and the control plane validates those intents against true system state and policy before execution. SAL combines an obfuscation membrane, which limits model access to identity-sensitive state, with a cryptographically linked Evidence Chain for auditability and replay. We formalize SAL and show that, under the stated assumptions, it provides policy-bounded execution, identity isolation, and deterministic replay. In an OpenKedge prototype for cloud infrastructure, SAL blocks 93% of unsafe intents at the policy layer, rejects the remaining 7% via consistency checks, prevents unsafe executions in our benchmark, and adds 12.4 ms median latency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a validation control plane for LLM agents that mutate real systems, with a prototype reporting a 93% policy-layer block rate and 12.4 ms of added median latency, but the safety claims depend on state-observability assumptions that look fragile in concurrent settings.

This paper introduces Sovereign Agentic Loops to decouple LLM intent generation from direct execution on real infrastructure. Models emit structured intents with justifications, and a separate control plane validates them against policy and current state before any API calls run. It adds an obfuscation membrane to limit model visibility into sensitive state and a cryptographically linked Evidence Chain for audit and replay. Under the stated assumptions, the formalization claims policy-bounded execution, identity isolation, and deterministic replay. The OpenKedge prototype reports blocking 93% of unsafe intents at the policy layer, catching the rest with consistency checks, zero unsafe executions in the benchmark, and 12.4 ms median added latency.

That overhead figure is directly useful to anyone running agents in production clouds. The separation of concerns and the audit trail address a real risk when stochastic outputs hit mutating systems. The design is a reasonable packaging of validation, limited access, and logging.

The main soft spot is the assumption that the control plane can reliably observe true system state at validation time. In cloud environments with concurrency, async updates, or partial views, state can shift between check and execution, so an intent that looked safe could become unsafe. The reported results come from controlled, non-concurrent tests and do not address this. The formalization is only as strong as its assumptions are in practice.

This is for engineers and researchers working on safe agent deployments in enterprise or cloud settings. Readers who need concrete architectures for policy enforcement and auditability would get usable ideas from the components and the measurements. It has a clear proposal plus some empirical data, so it deserves peer review. Referees can test whether the assumptions survive real workloads and whether the prototype generalizes. I would send it out rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper introduces Sovereign Agentic Loops (SAL), a control-plane architecture that decouples LLM reasoning from execution: models emit structured intents with justifications, which a control plane validates against true system state and policy before any mutation. SAL incorporates an obfuscation membrane for identity isolation and a cryptographically linked Evidence Chain for auditability and deterministic replay. The authors formalize the approach and claim that, under stated assumptions, it guarantees policy-bounded execution, identity isolation, and replayability. In the OpenKedge cloud-infrastructure prototype, SAL blocks 93% of unsafe intents at the policy layer, rejects the remaining 7% via consistency checks, prevents unsafe executions in benchmarks, and incurs 12.4 ms median latency.

Significance. If the formal guarantees hold and the prototype results generalize beyond controlled settings, SAL offers a concrete mechanism for reducing safety risks in agentic LLM systems that interact with real infrastructure. The separation of intent validation from execution, combined with the Evidence Chain for replay, could support auditability and compliance in production environments. The reported low latency suggests the overhead may be acceptable for many use cases, though broader adoption would require demonstrating robustness under realistic concurrency and partial observability.

major comments (2)

[Abstract and formalization] The central claim of policy-bounded execution rests on the control plane validating intents against accurate, current system state before execution. The stated assumptions treat state as reliably observable and static between validation and execution, but this is load-bearing for the safety properties. In asynchronous cloud settings (e.g., OpenKedge), eventual consistency, concurrent updates, or hidden side effects can invalidate the snapshot, allowing an intent validated as safe to become unsafe at execution time. The 93% block rate and benchmark results do not address this gap because they appear to rely on controlled, non-concurrent test scenarios.
[Prototype evaluation] The reported 93% blocking rate, 7% rejection via consistency checks, and prevention of unsafe executions are presented as empirical validation, yet the evaluation setup (controlled benchmarks, non-concurrent scenarios) does not test the weakest assumption of reliable state access. Without experiments under partial observability or concurrent mutations, the results cannot substantiate the claim that SAL prevents unsafe executions in real-world systems.

minor comments (2)

[Abstract] The abstract mentions 'under the stated assumptions' but does not enumerate them explicitly in the provided summary; a dedicated assumptions subsection would improve clarity and allow readers to assess the scope of the guarantees.
[Formalization] Notation for the Evidence Chain and obfuscation membrane should be defined with precise mathematical or pseudocode definitions early in the formalization to support the replay and isolation claims.
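
As an illustration of the kind of pseudocode-level definition this comment asks for, the sketch below shows one possible reading of an obfuscation membrane: real identifiers are replaced by stable keyed pseudonyms before the model sees them, and only the control plane can map them back. This is an assumption made for illustration, not the paper's definition; the `Membrane` class, the `res-` alias format, and the example identifier are all hypothetical.

```python
import hashlib
import hmac


class Membrane:
    """Maps real identifiers to stable pseudonyms and back (hypothetical)."""

    def __init__(self, key: bytes):
        self._key = key
        self._reverse = {}  # alias -> real identifier, kept on the control-plane side

    def mask(self, real_id: str) -> str:
        """Return the alias the model is allowed to see for this identifier."""
        tag = hmac.new(self._key, real_id.encode("utf-8"), hashlib.sha256).hexdigest()
        alias = f"res-{tag[:12]}"
        self._reverse[alias] = real_id
        return alias

    def unmask(self, alias: str) -> str:
        """Resolve an alias during validation; raises KeyError if never issued."""
        return self._reverse[alias]


# Demo key and identifier are placeholders, not values from the paper.
membrane = Membrane(key=b"demo-key-not-for-production")
real_id = "tenant-42/db-primary"
alias = membrane.mask(real_id)
```

Because the alias is derived with a keyed HMAC, the same resource always maps to the same alias within one deployment, so the model can reason about it consistently without learning the tenant or resource name.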

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the major comments point by point below, clarifying the scope of our claims and indicating revisions that will strengthen the presentation of assumptions and evaluation limitations.

Referee: [Abstract and formalization] The central claim of policy-bounded execution rests on the control plane validating intents against accurate, current system state before execution. The stated assumptions treat state as reliably observable and static between validation and execution, but this is load-bearing for the safety properties. In asynchronous cloud settings (e.g., OpenKedge), eventual consistency, concurrent updates, or hidden side effects can invalidate the snapshot, allowing an intent validated as safe to become unsafe at execution time. The 93% block rate and benchmark results do not address this gap because they appear to rely on controlled, non-concurrent test scenarios.

Authors: The formalization section states that all guarantees (policy-bounded execution, identity isolation, and replayability) hold 'under the stated assumptions,' which explicitly include reliable observability of system state at validation time and no intervening mutations. The control plane is modeled as operating on a consistent snapshot provided by the underlying infrastructure. We agree that eventual consistency and concurrency represent important practical considerations not covered by the current assumptions. In the revised manuscript we will add a dedicated 'Assumptions and Limitations' subsection that discusses eventual consistency, concurrent updates, and potential extensions such as intent versioning or optimistic concurrency control.

Revision: partial

Referee: [Prototype evaluation] The reported 93% blocking rate, 7% rejection via consistency checks, and prevention of unsafe executions are presented as empirical validation, yet the evaluation setup (controlled benchmarks, non-concurrent scenarios) does not test the weakest assumption of reliable state access. Without experiments under partial observability or concurrent mutations, the results cannot substantiate the claim that SAL prevents unsafe executions in real-world systems.

Authors: The prototype evaluation was deliberately conducted in a controlled, non-concurrent setting to isolate the contributions of the policy validation layer, obfuscation membrane, and Evidence Chain. The reported metrics demonstrate that, when the stated assumptions hold, the architecture blocks unsafe intents and adds modest latency. The manuscript does not claim that the 93% figure or benchmark results generalize to fully concurrent production workloads; it presents the OpenKedge implementation as a feasibility study. In revision we will (1) explicitly qualify the evaluation scope in the abstract and evaluation section, (2) add a limitations paragraph discussing the controlled nature of the tests, and (3) outline how the architecture could incorporate state versioning to address concurrency.

Revision: partial
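
The state-versioning idea raised in both responses can be sketched as an optimistic concurrency check: validation records the version of the state it inspected, and execution refuses to proceed if that version has since changed. This is a hedged illustration of the general technique, not the authors' design; `Resource`, `StaleState`, `validate_delete`, and `execute_delete` are hypothetical names, and a real system would need the version comparison and the mutation to be a single atomic operation (for example a conditional write).

```python
from dataclasses import dataclass


@dataclass
class Resource:
    version: int    # incremented by every mutation (assumed to be provided)
    attached: bool


class StaleState(Exception):
    """Raised when state changed between validation and execution."""


def validate_delete(resource: Resource) -> int:
    """Approve deletion only for detached volumes; return the version checked."""
    if resource.attached:
        raise PermissionError("volume is attached")
    return resource.version


def execute_delete(store: dict, name: str, validated_version: int) -> None:
    """Apply the delete only if the resource is unchanged since validation.

    In this in-memory sketch the check and the delete are not atomic;
    a real backend would have to enforce that.
    """
    current = store[name]
    if current.version != validated_version:
        raise StaleState(
            f"{name}: validated v{validated_version}, now v{current.version}"
        )
    del store[name]


store = {"vol-1": Resource(version=3, attached=False)}
token = validate_delete(store["vol-1"])

# A concurrent writer attaches the volume after validation has passed.
store["vol-1"] = Resource(version=4, attached=True)
```

Calling `execute_delete(store, "vol-1", token)` at this point raises `StaleState` instead of deleting a volume that is now attached, which is exactly the check-to-execution gap the referee describes.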

Circularity Check

0 steps flagged

No significant circularity; formalization and prototype results remain independent of inputs

The paper defines SAL as a new control-plane architecture, formalizes its properties (policy-bounded execution, identity isolation, deterministic replay) under stated assumptions, and reports empirical results from the OpenKedge prototype (93% block rate, 12.4 ms latency). No equations, parameters, or claims reduce by construction to fitted values, self-citations, or renamed prior results. The derivation chain relies on the introduced components (obfuscation membrane, Evidence Chain) and external benchmark validation rather than tautological redefinition of inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The central claim rests on formalization under unspecified assumptions and the correctness of the prototype implementation; no free parameters or invented physical entities are described.

axioms (1)

domain assumption · Under the stated assumptions, SAL provides policy-bounded execution, identity isolation, and deterministic replay.
The formal guarantees are conditioned on assumptions whose details are not provided in the abstract.

invented entities (3)

Sovereign Agentic Loops (SAL) · no independent evidence
purpose: Decoupling AI reasoning from execution via intent validation
New named architecture proposed in the paper.

obfuscation membrane · no independent evidence
purpose: Limits model access to identity-sensitive state
Invented component to enforce identity isolation.

Evidence Chain · no independent evidence
purpose: Cryptographically linked auditability and replay
New mechanism for deterministic replay and auditing.

pith-pipeline@v0.9.0 · 5469 in / 1539 out tokens · 64998 ms · 2026-05-08T11:47:05.709711+00:00 · methodology
