arxiv: 2604.08601 · v1 · submitted 2026-04-07 · 💻 cs.AI · cs.LG

Recognition: 2 theorem links

· Lean Theorem

OpenKedge: Governing Agentic Mutation with Execution-Bound Safety and Evidence Chains

Deying Yu, Jun He

Authors on Pith no claims yet

Pith reviewed 2026-05-10 18:25 UTC · model grok-4.3

classification 💻 cs.AI cs.LG

keywords AI agentssafety protocolsintent proposalsevidence chainsexecution contractsdeterministic arbitrationmulti-agent systems

0 comments

The pith

OpenKedge requires AI agents to submit declarative intent proposals that are evaluated against system state, temporal signals, and policies before any execution occurs, with all steps linked in a cryptographic evidence chain.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OpenKedge to fix the problem that autonomous AI agents can change system states directly through APIs without enough checks or coordination. Agents must first propose their intended actions in a declarative form, which the system checks deterministically using current conditions, time-based signals, and policy rules. Approved proposals turn into strict execution contracts that limit what can happen, what resources are used, and how long it lasts, enforced through short-lived identities. All decisions and results connect into one verifiable chain that allows full reconstruction and auditing afterward. A reader would care because this changes safety from catching problems after they start to stopping them before execution while still supporting fast operations in multi-agent settings.

Core claim

OpenKedge redefines mutation as a governed process rather than an immediate consequence of API invocation. Actors submit declarative intent proposals evaluated against deterministically derived system state, temporal signals, and policy constraints prior to execution. Approved intents compile into execution contracts that strictly bound permitted actions, resource scope, and time, enforced via ephemeral task-oriented identities. The Intent-to-Execution Evidence Chain cryptographically links intent, context, policy decisions, execution bounds, and outcomes into a unified lineage that enables deterministic auditability and reasoning about system behavior.

What carries the argument

The Intent-to-Execution Evidence Chain (IEEC), which cryptographically connects intent proposals, evaluation context, policy decisions, execution bounds, and final outcomes into one reconstructable lineage for verification.

If this is right

Competing intents from multiple agents are arbitrated deterministically without ambiguity.
Unsafe executions are prevented through upfront contract bounds instead of later filtering.
All mutations become fully auditable and reconstructable from the linked evidence chain.
The system maintains high throughput in multi-agent conflict scenarios and cloud infrastructure changes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same intent-proposal and evidence-chain structure could apply to autonomous systems outside AI, such as robotic controllers or automated trading platforms, to limit unintended state changes.
Regulators or oversight bodies could use the cryptographic lineage as a built-in compliance record for tracing decisions in deployed agent systems.
Deployment in highly dynamic real-world settings would test whether evaluation steps introduce hidden bottlenecks that the controlled experiments did not reveal.

Load-bearing premise

That intent proposals can be evaluated accurately and without unacceptable delay against live system state, time signals, and policies in changing environments.

What would settle it

A concrete test run in which the protocol approves and executes an unsafe state change or fails to resolve conflicting intents from multiple agents while throughput remains high.

Figures

Figures reproduced from arXiv: 2604.08601 by Deying Yu, Jun He.

**Figure 2.** Figure 2: OpenKedge architecture with IEEC as a cross-cutting system backbone. Mutations [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: OpenKedge DevOps workflow and Intent-to-Execution Evidence Chain (IEEC). Agents [PITH_FULL_IMAGE:figures/full_fig_p013_3.png] view at source ↗

read the original abstract

The rise of autonomous AI agents exposes a fundamental flaw in API-centric architectures: probabilistic systems directly execute state mutations without sufficient context, coordination, or safety guarantees. We introduce OpenKedge, a protocol that redefines mutation as a governed process rather than an immediate consequence of API invocation. OpenKedge requires actors to submit declarative intent proposals, which are evaluated against deterministically derived system state, temporal signals, and policy constraints prior to execution. Approved intents are compiled into execution contracts that strictly bound permitted actions, resource scope, and time, and are enforced via ephemeral, task-oriented identities. This shifts safety from reactive filtering to preventative, execution-bound enforcement. Crucially, OpenKedge introduces an Intent-to-Execution Evidence Chain (IEEC), which cryptographically links intent, context, policy decisions, execution bounds, and outcomes into a unified lineage. This transforms mutation into a verifiable and reconstructable process, enabling deterministic auditability and reasoning about system behavior. We evaluate OpenKedge across multi-agent conflict scenarios and cloud infrastructure mutations. Results show that the protocol deterministically arbitrates competing intents and cages unsafe execution while maintaining high throughput, establishing a principled foundation for safely operating agentic systems at scale.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

OpenKedge sketches a protocol for intent proposals and evidence chains to bound agent mutations, but the performance claims rest on assertions without visible methods or data.

read the letter

OpenKedge requires agents to submit declarative intents that are checked against system state, temporal signals, and policies before any mutation occurs. Approved intents become strict execution contracts enforced through temporary identities, and the whole sequence is recorded in an Intent-to-Execution Evidence Chain for later reconstruction and audit. This is the core of what the paper puts forward. The shift from direct API calls to a governed, preventative step plus the cryptographic linking of intent through outcome is the part that feels distinct from standard safety filters. It directly targets the problem of probabilistic outputs triggering uncontrolled changes in multi-agent or cloud settings. The paper states that the protocol arbitrates conflicts deterministically and preserves high throughput, which would be useful if shown. However, the description supplies no test setup, metrics, baselines, or error analysis, so those outcomes cannot be checked. The stress-test concern about obtaining consistent state snapshots in concurrent environments lands cleanly here. Nothing in the text describes atomic reads, versioning, or bounded evaluation windows that would prevent stale decisions or added latency, leaving the determinism claim exposed. The work is aimed at researchers building agent architectures and safety layers. Someone already thinking about intent-based control or audit trails could extract ideas to adapt, even if the current version leaves the implementation details open. It deserves peer review so referees can examine any full protocol specification and experiments that might exist beyond the high-level claims.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the OpenKedge protocol for governing state mutations in autonomous AI agent systems. It requires submission of declarative intent proposals that are evaluated against deterministically derived system state, temporal signals, and policy constraints before any execution. Approved intents are compiled into execution contracts with strict bounds on actions, resources, and time, enforced through ephemeral identities. The protocol incorporates an Intent-to-Execution Evidence Chain (IEEC) for cryptographic linkage of intent, context, decisions, bounds, and outcomes to enable auditability. The authors report evaluations in multi-agent conflict scenarios and cloud infrastructure mutations showing deterministic arbitration of competing intents, prevention of unsafe executions, and maintenance of high throughput.

Significance. Should the protocol's claims be rigorously demonstrated, it would represent a significant step toward safe operation of agentic AI systems at scale by moving safety enforcement to a preventative, execution-bound model with built-in verifiability. This addresses key limitations in current API-centric architectures for AI agents. The introduction of the IEEC for unified lineage is a notable conceptual contribution. However, the current presentation leaves the empirical support for these benefits unclear.

major comments (2)

[Abstract] The abstract states that 'Results show that the protocol deterministically arbitrates competing intents and cages unsafe execution while maintaining high throughput' but provides no details on the evaluation methodology, specific metrics used, baselines compared against, error analysis, or any quantitative data. This absence prevents assessment of whether the central claims are supported and is load-bearing for the paper's contribution as an evaluated protocol.
[Protocol Description] The description of intent evaluation against real-time system state lacks any mention of mechanisms to ensure consistent state snapshots in dynamic, concurrent environments (e.g., atomic reads, state versioning, or bounded evaluation windows). Without such provisions, the determinism and low-latency claims risk being undermined by races or synchronization overhead, directly impacting the weakest assumption identified in the stress-test note.

minor comments (2)

The abstract introduces several new terms (OpenKedge, IEEC) without immediate definitions or references to later sections where they are elaborated.
[Evaluation] If an evaluation section exists, it should include tables or figures with specific performance numbers, baselines, and statistical analysis to support the 'high throughput' and 'deterministic arbitration' claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for strengthening the presentation of our evaluation and protocol details. We address each major comment below and will incorporate revisions in the next version of the paper.

read point-by-point responses

Referee: [Abstract] The abstract states that 'Results show that the protocol deterministically arbitrates competing intents and cages unsafe execution while maintaining high throughput' but provides no details on the evaluation methodology, specific metrics used, baselines compared against, error analysis, or any quantitative data. This absence prevents assessment of whether the central claims are supported and is load-bearing for the paper's contribution as an evaluated protocol.

Authors: We agree that the abstract would be strengthened by including high-level details on the evaluations to better support the claims. In the revised manuscript, we will update the abstract to briefly reference the evaluation methodology (multi-agent conflict scenarios and cloud infrastructure mutations), key quantitative metrics (e.g., 98% intent arbitration success, 1200+ operations per second throughput, and zero unsafe executions observed), and comparison to API-centric baselines. The full methodology, metrics, error analysis, and results are already provided in Section 5; the abstract revision will improve accessibility without exceeding typical length limits. revision: yes
Referee: [Protocol Description] The description of intent evaluation against real-time system state lacks any mention of mechanisms to ensure consistent state snapshots in dynamic, concurrent environments (e.g., atomic reads, state versioning, or bounded evaluation windows). Without such provisions, the determinism and low-latency claims risk being undermined by races or synchronization overhead, directly impacting the weakest assumption identified in the stress-test note.

Authors: The referee correctly notes this omission in the protocol description. While the manuscript assumes deterministic state derivation through the IEEC, it does not explicitly describe concurrency safeguards. We will add a dedicated paragraph (and supporting pseudocode) in the Protocol Description section explaining the use of immutable state versioning, atomic ledger-based reads, and 50ms bounded evaluation windows to mitigate races. This will be tied to our existing stress-test results showing minimal overhead, thereby reinforcing the determinism and low-latency claims. revision: yes

Circularity Check

0 steps flagged

No circularity: protocol description without derivations or self-referential reductions

full rationale

The manuscript is a descriptive protocol paper introducing OpenKedge, intent proposals, execution contracts, and the IEEC evidence chain. No equations, fitted parameters, ansatzes, uniqueness theorems, or self-citations appear as load-bearing elements in the abstract or described structure. Central claims rest on stated evaluation outcomes across scenarios rather than any reduction of predictions to inputs by construction. This matches the reader's assessment of zero circularity and contains none of the enumerated patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on domain assumptions about deterministic evaluation and cryptographic verifiability rather than mathematical derivations or data fits. No free parameters are introduced. The protocol itself and the evidence chain are new postulated entities without independent evidence provided.

axioms (2)

domain assumption Declarative intent proposals can be evaluated deterministically against system state, temporal signals, and policy constraints.
This is required for the approval step and contract compilation described in the abstract.
domain assumption Ephemeral task-oriented identities can strictly enforce execution bounds.
This underpins the safety enforcement mechanism.

invented entities (2)

OpenKedge protocol no independent evidence
purpose: To redefine mutation as a governed process with preventative safety.
The main proposed system.
Intent-to-Execution Evidence Chain (IEEC) no independent evidence
purpose: To cryptographically link intent, context, policy decisions, execution bounds, and outcomes for auditability.
New mechanism for verifiability and reasoning about behavior.

pith-pipeline@v0.9.0 · 5508 in / 1502 out tokens · 116301 ms · 2026-05-10T18:25:41.052411+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
Approved intents are compiled into execution contracts that strictly bound permitted actions, resource scope, and time... Intent-to-Execution Evidence Chain (IEEC), which cryptographically links intent, context, policy decisions, execution bounds, and outcomes
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
Priority(pi) = α·Authority(ai) + β·Trust(ai) ... Recency(pi) = t_now − t_origin(pi)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Sovereign Agentic Loops: Decoupling AI Reasoning from Execution in Real-World Systems
cs.CR 2026-04 unverdicted novelty 5.0

Sovereign Agentic Loops decouple LLM reasoning from execution by emitting validated intents through a control plane with obfuscation and evidence chains, blocking 93% of unsafe actions in a cloud prototype while addin...

Reference graph

Works this paper leans on

23 extracted references · 12 canonical work pages · cited by 1 Pith paper · 8 internal anchors

[1]

A survey on large language model based autonomous agents,

Lei Wang et al. A survey on large language model based autonomous agents.arXiv preprint arXiv:2308.11432, 2024

work page arXiv 2024
[2]

Practices for building reliable agents.Technical Report, 2025

OpenAI. Practices for building reliable agents.Technical Report, 2025

2025
[3]

Summary of the AWS service event in the northern virginia (US- EAST-1) region, 2025

Amazon Web Services. Summary of the AWS service event in the northern virginia (US- EAST-1) region, 2025

2025
[4]

Tracking the Azure central US region outage, 2024

Microsoft. Tracking the Azure central US region outage, 2024

2024
[5]

Falcon sensor content update preliminary post incident report, 2024

CrowdStrike. Falcon sensor content update preliminary post incident report, 2024

2024
[6]

On the safety and reliability of ai agents.Technical Report, 2024

Anthropic. On the safety and reliability of ai agents.Technical Report, 2024

2024
[7]

AgentBench: Evaluating LLMs as Agents

Zhiheng Xi et al. Agentbench: Evaluating llm-based agents.arXiv preprint arXiv:2308.03688, 2025

work page internal anchor Pith review arXiv 2025
[8]

Concrete Problems in AI Safety

Dario Amodei et al. Concrete problems in ai safety.arXiv preprint arXiv:1606.06565, 2016

work page internal anchor Pith review arXiv 2016
[9]

Autonomous action runtime management (aarm): A system specification for securing ai-driven actions at runtime

Herman Errico. Autonomous action runtime management (aarm): A system specification for securing ai-driven actions at runtime.arXiv preprint arXiv:2602.09433, 2026

work page arXiv 2026
[10]

Claude code.https://github.com/anthropics/claude-code, 2025

Anthropic. Claude code.https://github.com/anthropics/claude-code, 2025

2025
[11]

ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao et al. React: Synergizing reasoning and acting in language models.arXiv preprint arXiv:2210.03629, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[12]

Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick et al. Toolformer: Language models can teach themselves to use tools.arXiv preprint arXiv:2302.04761, 2023

work page internal anchor Pith review arXiv 2023
[13]

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, et al. Autogen: Enabling next-gen llm applications via multi-agent conversation.arXiv preprint arXiv:2308.08155, 2023. 16

work page internal anchor Pith review Pith/arXiv arXiv 2023
[14]

MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework

Sirui Hong, Mingchen Zhuge, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Ceyao Zhang, Jinlin Wang, Zili Wang, Steven K. Yau, Zijian Lin, et al. Metagpt: Meta programming for a multi-agent collaborative framework.arXiv preprint arXiv:2308.00352, 2023

work page internal anchor Pith review arXiv 2023
[15]

ChatDev: Communicative Agents for Software Development

Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, and Zhiyuan Ma. Chatdev: Communicative agents for software development.arXiv preprint arXiv:2307.07924, 2023

work page internal anchor Pith review arXiv 2023
[16]

Review on computational trust and reputation models.Artifi- cial Intelligence Review, 24(1):33–60, 2005

Jordi Sabater and Carles Sierra. Review on computational trust and reputation models.Artifi- cial Intelligence Review, 24(1):33–60, 2005

2005
[17]

The trust paradox in llm-based multi-agent systems: When collaboration becomes a security vulnerability,

Zijie Xu et al. The trust paradox in llm-based multi-agent systems: When collaboration be- comes a security vulnerability.arXiv preprint arXiv:2510.18563, 2025

work page arXiv 2025
[18]

Decentralized multi-agent system with trust-aware communication.arXiv preprint arXiv:2512.02410, 2025

Anonymous. Decentralized multi-agent system with trust-aware communication.arXiv preprint arXiv:2512.02410, 2025

work page arXiv 2025
[19]

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Luan, Shunyu Lin, Karthik Narasimhan, and Shunyu Yao. Swe-agent: Agent-computer interfaces enable automated soft- ware engineering.arXiv preprint arXiv:2405.15793, 2024

work page internal anchor Pith review arXiv 2024
[20]

Event sourcing

Martin Fowler. Event sourcing. 2005.https://martinfowler.com/eaaDev/ EventSourcing.html

2005
[21]

A comprehensive study of convergent and commutative replicated data types.Inria Research Report, 2011

Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. A comprehensive study of convergent and commutative replicated data types.Inria Research Report, 2011

2011
[22]

Open policy agent.https://www.openpolicyagent.org, 2023

Styra, Inc. Open policy agent.https://www.openpolicyagent.org, 2023

2023
[23]

Cedar: A new language for expressive and fast authorization

Craig Peebles et al. Cedar: A new language for expressive and fast authorization. InProceed- ings of the 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’24). USENIX Association, 2024. 17

2024