From Craft to Kernel: A Governance-First Execution Architecture and Semantic ISA for Agentic Computers
Pith reviewed 2026-05-10 04:48 UTC · model grok-4.3
The pith
Arbiter-K enforces security as a microarchitectural property in agentic AI by using a Semantic ISA to reify model outputs into taint-tracked discrete instructions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Arbiter-K reconceptualizes the underlying model as a Probabilistic Processing Unit encapsulated by a deterministic, neuro-symbolic kernel. It implements a Semantic Instruction Set Architecture to reify probabilistic messages into discrete instructions. This allows the kernel to maintain a Security Context Registry and construct an Instruction Dependency Graph at runtime, enabling active taint propagation based on the data-flow pedigree of each reasoning node. The kernel then precisely interdicts unsafe trajectories at deterministic sinks and performs autonomous execution correction and architectural rollback when policies are triggered.
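The taint-propagation mechanism described above can be sketched as follows. The `Node` schema, field names, and example pipeline are illustrative assumptions for this review, not the paper's implementation: the key point is only that taint flows along data-flow edges from sources to deterministic sinks.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One reasoning node in the Instruction Dependency Graph (hypothetical schema)."""
    name: str
    tainted: bool = False
    deps: list = field(default_factory=list)  # upstream nodes whose outputs this node reads

def is_tainted(node: Node) -> bool:
    """Taint propagates along data-flow edges: a node is tainted if it was
    marked directly or if any node in its pedigree is tainted."""
    return node.tainted or any(is_tainted(d) for d in node.deps)

# Content fetched from the open web is the taint source; the email sink
# inherits its pedigree through parse -> draft -> send.
web = Node("fetch_url", tainted=True)
parse = Node("parse_html", deps=[web])
draft = Node("draft_email", deps=[parse])
send = Node("send_email", deps=[draft])  # deterministic sink

print(is_tainted(send))  # True -> the kernel would interdict before the sink runs
```

Because the pedigree is tracked per node rather than per session, an untainted branch of the same plan (say, a purely local file read) would still be allowed through.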
What carries the argument
The Semantic Instruction Set Architecture that converts probabilistic model outputs into discrete instructions, together with the runtime Instruction Dependency Graph that supports taint propagation from data-flow pedigrees.
If this is right
- Unsafe trajectories are interdicted at precise deterministic sinks such as high-risk tool calls or unauthorized network egress.
- The kernel can trigger autonomous execution correction and architectural rollback when a security policy is activated.
- Security becomes enforceable as a microarchitectural property rather than an external heuristic layer.
- Evaluations on OpenClaw and NanoBot show interception rates between 76% and 95%, producing a 92.79% absolute gain over native policies.
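The first two bullets reduce to a policy check at the sink boundary. A minimal sketch, assuming a hypothetical sink list and verdict shape (the paper does not publish its actual policy set):

```python
# Minimal sketch of interdiction at deterministic sinks. The sink names and
# the BLOCK/ALLOW verdict shape are illustrative assumptions.
HIGH_RISK_SINKS = {"net_egress", "shell_exec", "send_email"}

def check_sink(sink: str, tainted: bool) -> str:
    """Return the kernel's verdict for an instruction arriving at a sink."""
    if sink in HIGH_RISK_SINKS and tainted:
        return "BLOCK"  # would trigger execution correction / rollback
    return "ALLOW"

print(check_sink("net_egress", tainted=True))  # BLOCK
print(check_sink("read_file", tainted=True))   # ALLOW: not a deterministic sink
```

The design choice worth noting is that the check is cheap and deterministic precisely because all the probabilistic work happened earlier, at reification time.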
Where Pith is reading between the lines
- The deterministic kernel layer could be combined with conventional operating-system isolation primitives to create defense-in-depth for agentic workloads.
- Custom policy definitions at the Semantic ISA level would let domain experts express governance rules without modifying the underlying model.
- Similar reification and dependency-graph techniques might apply to other settings where probabilistic components must interface with deterministic safety constraints.
Load-bearing premise
Probabilistic outputs from the underlying model can be reliably and losslessly reified into discrete instructions via the Semantic ISA without introducing new failure modes or missing critical context.
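To make the premise concrete, here is one hypothetical shape a reifier could take; the `<call>` markup, field names, and fail-closed behavior are our assumptions, not the paper's method. The point is that anything unparseable must be rejected rather than silently dropped, since missing context here undermines all downstream taint tracking.

```python
import json
import re

def reify(raw: str) -> dict:
    """Hypothetical reifier: extract one discrete instruction from free-form
    model output, failing closed (raising) when the output cannot be parsed."""
    m = re.search(r"<call>(.*?)</call>", raw, re.S)
    if m is None:
        raise ValueError("output could not be reified into an instruction")
    instr = json.loads(m.group(1))
    if not {"op", "args"} <= instr.keys():
        raise ValueError("reified instruction is missing required fields")
    return instr

out = reify('I will fetch it. <call>{"op": "http_get", "args": {"url": "https://example.com"}}</call>')
print(out["op"])  # http_get
```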
What would settle it
An evaluation run in which an unsafe action (such as an unauthorized network call) completes without interception because the Semantic ISA reification omitted or distorted key context from the model's probabilistic output.
Figures
Original abstract
The transition of agentic AI from brittle prototypes to production systems is stalled by a pervasive crisis of craft. We suggest that the prevailing orchestration paradigm, which delegates the system control loop to large language models and merely patches it with heuristic guardrails, is the root cause of this fragility. Instead, we propose Arbiter-K, a Governance-First execution architecture that reconceptualizes the underlying model as a Probabilistic Processing Unit encapsulated by a deterministic, neuro-symbolic kernel. Arbiter-K implements a Semantic Instruction Set Architecture (ISA) to reify probabilistic messages into discrete instructions. This allows the kernel to maintain a Security Context Registry and construct an Instruction Dependency Graph at runtime, enabling active taint propagation based on the data-flow pedigree of each reasoning node. By leveraging this mechanism, Arbiter-K precisely interdicts unsafe trajectories at deterministic sinks (e.g., high-risk tool calls or unauthorized network egress) and enables autonomous execution correction and architectural rollback when security policies are triggered. Evaluations on OpenClaw and NanoBot demonstrate that Arbiter-K enforces security as a microarchitectural property, achieving 76% to 95% unsafe interception for a 92.79% absolute gain over native policies. The code is publicly available at https://github.com/cure-lab/ArbiterOS.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Arbiter-K, a governance-first execution architecture that treats the underlying LLM as a Probabilistic Processing Unit encapsulated by a deterministic neuro-symbolic kernel. It introduces a Semantic Instruction Set Architecture (ISA) to reify probabilistic outputs into discrete instructions, populating a Security Context Registry and constructing an Instruction Dependency Graph for runtime taint propagation. This enables deterministic interdiction of unsafe trajectories at sinks such as high-risk tool calls. Evaluations on OpenClaw and NanoBot report 76%–95% unsafe interception rates, for a 92.79% absolute gain over native policies; the code is released publicly.
Significance. If the central claims hold, the work would represent a meaningful shift from heuristic guardrails to microarchitectural security enforcement in agentic systems, potentially improving reliability for production autonomous agents. Public code availability supports reproducibility and is a clear strength.
major comments (3)
- [Evaluation] Evaluation section (and abstract): the reported 76%–95% interception rates and 92.79% gain lack any description of experimental setup, test-case count, baseline implementations, or controls for confounds. Without these, the empirical support for the central security claim cannot be assessed.
- [Architecture] Semantic ISA reification step (abstract and § on architecture): the lossless conversion of probabilistic LLM outputs into discrete instructions that populate the Security Context Registry and Instruction Dependency Graph is load-bearing for all taint-based interdiction claims, yet no fidelity metrics, error rates, coverage statistics, or stress tests on context loss or misclassification are provided.
- [Architecture] Instruction Dependency Graph construction (architecture description): the paper does not address how the graph is built or maintained when reification is incomplete or ambiguous, which directly affects the reliability of data-flow pedigree tracking and deterministic sink interdiction.
minor comments (2)
- Notation for the Semantic ISA and its mapping to the kernel is introduced without a formal definition or example instruction format, making the reification process hard to follow.
- The abstract states the code is publicly available but does not include the repository URL in the main text body; ensure consistency.
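In the spirit of the first minor comment, here is one hypothetical shape an example instruction format could take. The field names and opcodes below are our invention for illustration; the paper defines no formal encoding.

```python
from dataclasses import dataclass

# Purely illustrative: one way a Semantic ISA instruction could be encoded.
@dataclass(frozen=True)
class SemInstr:
    opcode: str      # e.g. "TOOL_CALL", "MEM_WRITE", "NET_EGRESS" (hypothetical)
    operands: tuple  # discrete arguments extracted from the model output
    provenance: str  # "user" | "model" | "untrusted_external"
    node_id: int     # index of this instruction in the Instruction Dependency Graph

instr = SemInstr("NET_EGRESS", ("https://example.com",), "untrusted_external", 7)
print(instr.opcode)  # NET_EGRESS
```

Even a sketch at this level would make the reification process, and its failure modes, much easier to audit.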
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. The comments identify important gaps in the empirical and architectural descriptions that we will address through targeted revisions. Below we respond point by point to each major comment.
Point-by-point responses
- Referee: [Evaluation] Evaluation section (and abstract): the reported 76%–95% interception rates and 92.79% gain lack any description of experimental setup, test-case count, baseline implementations, or controls for confounds. Without these, the empirical support for the central security claim cannot be assessed.
  Authors: We agree that the current evaluation section provides insufficient detail to allow independent assessment of the reported interception rates and performance gains. In the revised manuscript we will expand the evaluation section (and update the abstract accordingly) to include: the total number of test cases and scenarios drawn from the OpenClaw and NanoBot benchmarks; explicit descriptions of the baseline implementations (native LLM-driven agent policies without the Arbiter-K kernel); and the controls used for potential confounds such as prompt phrasing, temperature settings, and environmental variability. These additions will be supported by the publicly released code. Revision: yes.
- Referee: [Architecture] Semantic ISA reification step (abstract and § on architecture): the lossless conversion of probabilistic LLM outputs into discrete instructions that populate the Security Context Registry and Instruction Dependency Graph is load-bearing for all taint-based interdiction claims, yet no fidelity metrics, error rates, coverage statistics, or stress tests on context loss or misclassification are provided.
  Authors: The referee is correct that quantitative validation of the reification step is necessary to support the downstream security claims. While the architecture section describes the Semantic ISA conceptually, we did not report supporting metrics. We will add a new subsection that presents fidelity metrics, reification error rates, coverage statistics, and stress-test results on context loss and misclassification, using both the existing evaluation traces and additional analysis performed on the released codebase. Revision: yes.
- Referee: [Architecture] Instruction Dependency Graph construction (architecture description): the paper does not address how the graph is built or maintained when reification is incomplete or ambiguous, which directly affects the reliability of data-flow pedigree tracking and deterministic sink interdiction.
  Authors: We acknowledge that the manuscript does not explicitly describe graph construction and maintenance under incomplete or ambiguous reification. This omission affects the claimed reliability of taint propagation. In the revised architecture section we will add a description of the fallback mechanisms employed, including conservative default tainting rules, ambiguity-resolution heuristics, and how these choices preserve deterministic sink interdiction even when reification is imperfect. Revision: yes.
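The conservative default the authors describe can be sketched in a few lines. This is our reading of the proposal, not released code; the `provenance` field and fail-closed rule are assumptions.

```python
def fallback_taint(node_meta: dict) -> bool:
    """Sketch of a conservative default tainting rule: when reification
    leaves a node's provenance ambiguous, fail closed and treat it as tainted."""
    provenance = node_meta.get("provenance")  # None when reification was ambiguous
    if provenance is None:
        return True  # ambiguous -> assume untrusted
    return provenance == "untrusted"

print(fallback_taint({}))                      # True  (ambiguous, fail closed)
print(fallback_taint({"provenance": "user"}))  # False (trusted source)
```

Failing closed preserves sink interdiction under imperfect reification at the cost of more false positives, which is presumably where the ambiguity-resolution heuristics come in.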
Circularity Check
No circularity; architecture proposal and empirical results are self-contained
Full rationale
The paper advances a governance-first architecture (Arbiter-K) with a Semantic ISA for reifying LLM outputs into discrete instructions, a Security Context Registry, and an Instruction Dependency Graph for taint-based interdiction. Central claims rest on this design plus reported evaluation metrics (76%–95% interception, 92.79% gain) on OpenClaw and NanoBot. No equations, fitted parameters renamed as predictions, self-citations invoked as uniqueness theorems, or ansatzes smuggled via prior work appear in the abstract or description. The derivation chain does not reduce to its inputs by construction; the reification step is an explicit design choice whose fidelity is left to empirical validation rather than assumed by definition.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLM outputs can be accurately reified into discrete instructions by the Semantic ISA without significant information loss or new vulnerabilities.
invented entities (2)
- Semantic Instruction Set Architecture (ISA): no independent evidence
- Instruction Dependency Graph: no independent evidence
Reference graph
Works this paper leans on
- [1] Amazon. 2025. Amazon Bedrock AgentCore Policy: Control Agent-to-Tool Interactions. https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html
- [2] Anthropic. 2025. Equipping agents for the real world with Agent Skills. https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
- [3] Xiaohe Bo, Zeyu Zhang, Quanyu Dai, Xueyang Feng, Lei Wang, Rui Li, Xu Chen, and Ji-Rong Wen. 2024. Reflective Multi-Agent Collaboration based on Large Language Models. In Proceedings of NeurIPS
- [4] Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ArXiv abs/2406.13352 (2024). https://api.semanticscholar.org/CorpusID:270619628
- [6] Invariant Labs. 2025. Invariant Guardrails. https://github.com/invariantlabs-ai/invariant
- [7] IronClaw Contributors. 2026. IronClaw: Your secure personal AI assistant, always on your side. https://github.com/nearai/ironclaw
- [8] Yang JingYi, Shuai Shao, Dongrui Liu, and Jing Shao. 2025. RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents. In NeurIPS
- [11] Thomas Kuntz, Agatha Duzan, Hao Zhao, Francesco Croce, J Zico Kolter, Nicolas Flammarion, and Maksym Andriushchenko. 2025. OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents. In NeurIPS
- [13] Songyang Liu, Chaozhuo Li, Chenxu Wang, Jinyu Hou, Zejian Chen, Litian Zhang, Zheng Liu, Qiwei Ye, Yiming Hei, Xi Zhang, and Zhongyuan Wang. 2026. ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers. ArXiv abs/2603.24414 (2026)
- [14] Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, and Shenghua Liu. 2025. A Survey of Context Engineering for Large Language Models. ArXiv abs/2507.13334 (2025)
- [15] NanoBot Contributors. 2026. NanoBot: Ultra-Lightweight Personal AI Agent. https://github.com/HKUDS/nanobot
- [16] OpenClaw Contributors. 2026. OpenClaw: Open-Source AI Agent Runtime. https://github.com/openclaw/openclaw
- [17] Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Sohel Mondal, and Aman Chadha. 2024. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. ArXiv abs/2402.07927 (2024)
- [18] Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, HyoJung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Minh Pham, Gerson C. Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inn... arXiv:2406.06608. https://arxiv.org/abs/2406.06608
- [19] Ava Spataru, Eric Hambro, Elena Voita, and Nicola Cancedda. 2024. Know When To Stop: A Study of Semantic Drift in Text Generation. In Proceedings of NAACL. 3656–3671
- [20] Charles L. Wang, Trisha Singhal, Ameya Kelkar, and Jason Tuo. 2025. MI9 – Agent Intelligence Protocol: Runtime Governance for Agentic AI Systems. ArXiv abs/2508.03858 (2025)
- [23] Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, and Yongfeng Zhang. 2025. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents. In ICLR
- [24] Zhexin Zhang, Shiyao Cui, Yida Lu, Jingzhuo Zhou, Junxiao Yang, Hongning Wang, and Minlie Huang. 2024. Agent-SafetyBench: Evaluating the Safety of LLM Agents. ArXiv abs/2412.14470 (2024). https://api.semanticscholar.org/CorpusID:274859514
- [25] Wei Zhao, Zhe Li, Peixin Zhang, and Jun Sun. 2026. ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection. ArXiv abs/2604.11790v1 (2026)