pith. machine review for the scientific record.

arxiv: 2604.16838 · v2 · submitted 2026-04-18 · 💻 cs.CR · cs.AI · cs.MA

Recognition: no theorem link

enclawed: A Configurable, Sector-Neutral Hardening Framework for Single-User AI Assistant Gateways

Alfredo Metere

Pith reviewed 2026-05-12 01:46 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.MA

keywords AI gateway hardening · signed proof bundles · tamper-evident audit · deny-by-default connectivity · verification lattice · extension admission gate · regulated AI deployment · prompt injection defense

The pith

Enclawed hardens single-user AI gateways by adding signed proof bundles, deny-by-default connectivity, and a closed four-level verification lattice on top of OpenClaw.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents enclawed as a configurable hardening framework forked from the OpenClaw AI assistant gateway. It supplies two flavors: an open flavor that emits audit and classification signals while staying compatible with upstream, and an enclaved flavor that enforces strict allowlists, FIPS crypto assertions, mandatory signature checks, and high-assurance peer attestation. It also includes a data-driven classification ladder with five presets or custom JSON, a 356-case test suite covering tamper detection through prompt injection, and a biconditional extension-admission gate. The work closes the top of a four-level verification lattice using skill-formal primitives plus a CLI that emits signed proof-carrying bundles the runtime re-validates at load. The framework targets regulated single-user deployments that require attestable trust without claiming to replace full certification or hardware validation.
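The classification ladder is described as data-driven: deployers pick a built-in preset or supply their own JSON. A minimal sketch of how such a ladder could work, where the preset name, JSON shape, and function names are illustrative assumptions rather than enclawed's actual interface:

```python
import json

# Hypothetical preset; enclawed ships five, whose names the review does not list.
PRESETS = {
    "corporate": ["public", "internal", "confidential", "restricted"],
}

def load_ladder(preset=None, custom_json=None):
    """Return a mapping from level name to rank, lowest sensitivity first."""
    if custom_json is not None:
        levels = json.loads(custom_json)["levels"]  # deployer-supplied JSON
    else:
        levels = PRESETS[preset]                    # built-in preset
    return {name: rank for rank, name in enumerate(levels)}

def dominates(ladder, a, b):
    """True when level `a` is at least as sensitive as level `b`."""
    return ladder[a] >= ladder[b]

ladder = load_ladder(custom_json='{"levels": ["open", "sensitive", "secret"]}')
assert dominates(ladder, "secret", "open")
assert not dominates(ladder, "open", "sensitive")
```

The point of the data-driven design is that classification policy lives in configuration, not code, so the same enforcement path serves every sector.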

Core claim

enclawed supplies two flavors of an OpenClaw hard-fork that together activate signed-module loading, tamper-evident audit trails, a memory-bounded transaction buffer with rollback, strict-mode TypeScript checks, and a biconditional extension-admission gate. The four-level verification lattice is closed by four skill-formal primitives whose outputs a CLI packages into signed proof-carrying bundles that the runtime re-checks at load, moving a skill from tested to formal via static effect containment, refinement-typed dispatch, and bounded model checking.
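One way to picture the signed proof-carrying bundles is a sign-then-re-verify round trip: the CLI signs the packaged verification outputs, and the runtime refuses any bundle whose signature no longer matches. The sketch below uses HMAC-SHA256 purely as a stand-in for whatever signature scheme enclawed actually employs (the abstract cites EdDSA-era machinery); every name here is hypothetical:

```python
import hashlib, hmac, json

KEY = b"demo-signing-key"  # stand-in; a real deployment would use asymmetric keys

def sign_bundle(proofs: dict) -> dict:
    """CLI side: package verification outputs and sign a canonical encoding."""
    payload = json.dumps(proofs, sort_keys=True).encode()
    return {"proofs": proofs, "sig": hmac.new(KEY, payload, hashlib.sha256).hexdigest()}

def verify_bundle(bundle: dict) -> bool:
    """Runtime side: re-check the signature at load before honoring the proofs."""
    payload = json.dumps(bundle["proofs"], sort_keys=True).encode()
    expected = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bundle["sig"])

bundle = sign_bundle({"effect_containment": "pass", "bounded_model_check": "pass"})
assert verify_bundle(bundle)
bundle["proofs"]["bounded_model_check"] = "skip"  # tampering is detected
assert not verify_bundle(bundle)
```

The design choice worth noting is that the runtime re-validates rather than trusting the CLI: the bundle carries its proof with it, so a modified artifact fails closed.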

What carries the argument

The biconditional extension-admission gate that extends the skill trust schema to non-skill extensions, combined with the closed four-level verification lattice using skill-formal-* primitives and the CLI that produces signed proof-carrying bundles.
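A biconditional gate of this kind can be read as "admitted if and only if trusted": signature verification and capability containment must both hold, and nothing outside the gate is admitted. A minimal sketch, with field names that are assumptions rather than enclawed's schema:

```python
def admitted(ext: dict) -> bool:
    """Biconditional gate (sketch): an extension is admitted if and only if
    its manifest signature verifies AND every capability it requests is
    covered by its signed trust grant. Field names are illustrative."""
    return ext["signature_ok"] and set(ext["requests"]) <= set(ext["granted"])

good = {"signature_ok": True, "requests": ["net"], "granted": ["net", "fs-read"]}
over = {"signature_ok": True, "requests": ["net", "exec"], "granted": ["net"]}
bad_sig = {"signature_ok": False, "requests": ["net"], "granted": ["net"]}
assert admitted(good)
assert not admitted(over)      # over-privileged request is refused
assert not admitted(bad_sig)   # unsigned manifest is refused
```

Extending this same check to non-skill extensions is what lets the skill trust schema govern every net-capable component, not just skills.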

If this is right

  • Regulated single-user deployments gain attestable peer trust and mandatory manifest signature verification without changing the underlying OpenClaw interface.
  • The 356-case test suite and real-time human-in-the-loop controls become part of the default release process for both open and enclaved flavors.
  • Skills can be raised from tested to formal status through static effect containment and bounded model checking packaged as signed bundles.
  • Deployers retain responsibility for hardware, validated crypto, and assessor sign-off while the framework supplies the software controls.
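The tested-to-formal promotion in the third bullet can be sketched as a single gate over the three checks the paper names. The lower level names below are assumptions (only "tested" and "formal" appear in the review), as is the check-result encoding:

```python
# Hypothetical four-level lattice, lowest trust first; only the top two
# levels ("tested", "formal") are named in the paper.
LEVELS = ["untrusted", "declared", "tested", "formal"]

def promote_to_formal(skill_level: str, checks: dict) -> str:
    """A skill moves from tested to formal only when all three
    skill-formal checks named in the paper report success."""
    required = ("effect_containment", "refinement_typed_dispatch", "bounded_model_check")
    if skill_level == "tested" and all(checks.get(c) == "pass" for c in required):
        return "formal"
    return skill_level  # any missing or failed check leaves the level unchanged

all_pass = {"effect_containment": "pass",
            "refinement_typed_dispatch": "pass",
            "bounded_model_check": "pass"}
assert promote_to_formal("tested", all_pass) == "formal"
assert promote_to_formal("tested", {"effect_containment": "pass"}) == "tested"
```

Closing the lattice "at the top" means the formal level is now reachable at all: before these primitives, tested was the ceiling.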

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar hardening patterns could be applied to other AI gateway codebases that expose extension points and audit logs.
  • The separation of open and enclaved flavors suggests a practical way to offer graduated security levels inside the same codebase.
  • Closing the verification lattice at the formal level creates a reusable template for moving other AI components from ad-hoc testing to proof-carrying artifacts.
  • The memory-bounded rollback buffer and biconditional gate together imply a design that could limit damage from compromised extensions even if the model itself remains untrusted.

Load-bearing premise

The described mechanisms, including the biconditional extension-admission gate, strict-mode TypeScript checks, and memory-bounded transaction buffer, deliver the claimed security properties in real deployments without additional external validation or hardware assumptions.
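Deny-by-default egress, one of the mechanisms this premise leans on, is conceptually simple: a destination is reachable only if it is explicitly allowlisted. A sketch with illustrative hosts (not enclawed's defaults):

```python
from urllib.parse import urlparse

# Illustrative allowlist; a real deployment would load this from signed config.
ALLOWLIST = {"api.example-model-host.internal"}

def egress_permitted(url: str) -> bool:
    """Deny-by-default: permit only explicitly allowlisted hosts;
    everything else, including anything unparseable, is refused."""
    return urlparse(url).hostname in ALLOWLIST

assert egress_permitted("https://api.example-model-host.internal/v1/chat")
assert not egress_permitted("https://attacker.example.com/exfil")
```

The open question the premise leaves is whether the enforcement point sits below everything an extension can reach; an allowlist checked in the wrong layer is bypassable, which is exactly what the adversarial pen-tests probe.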

What would settle it

A documented case in which an adversary forges a manifest signature, truncates the audit log, or bypasses the deny-by-default egress rules inside an otherwise correctly configured enclawed deployment would disprove the hardening claims.

read the original abstract

We present enclawed, a hard-fork hardening framework built on the OpenClaw AI assistant gateway. enclawed targets deployments that need attestable peer trust, deny-by-default external connectivity, signed-module loading, and a tamper-evident audit trail -- typically regulated industries (financial services, healthcare, defense, government). The framework ships in two flavors: an open flavor preserving OpenClaw compatibility while emitting audit, classification, and data-loss-prevention (DLP) signals, and an enclaved flavor activating strict allowlists, FIPS cryptographic-module assertion, mandatory manifest signature verification, and high-assurance peer attestation for the Model Context Protocol. The classification ladder is data-driven: deployers pick from five built-in presets or supply their own JSON. We ship a 356-case test suite (261 unit + 95 adversarial pen-tests) covering tamper detection, signature forgery, egress bypass, audit-log truncation, trust-root mutation, DLP evasion, prompt injection, code injection, and biconditional admission for net-capable extensions; real-time human-in-the-loop control; a memory-bounded transaction buffer with rollback; strict-mode TypeScript typecheck; and a CI workflow. The biconditional extension-admission gate extends the skill trust schema to non-skill extensions. The four-level verification lattice is now closed at the top: four skill-formal-* primitives plus a CLI produce a signed proof-carrying bundle the runtime re-checks at load, raising a skill from tested to formal via static effect-containment, refinement-typed dispatch, and bounded model checking. enclawed is a hardening framework, not an accredited certification; hardware, validated crypto, facilities, and assessor sign-off remain the deployer's responsibility.
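The abstract's memory-bounded transaction buffer with rollback can be pictured as staged operations in a bounded queue that are either committed wholesale or discarded. A sketch under stated assumptions; the class and method names are illustrative, not enclawed's API:

```python
from collections import deque

class TransactionBuffer:
    """Sketch of a memory-bounded transaction buffer with rollback:
    staged operations live in a bounded deque until committed and can
    be discarded wholesale on rollback. Illustrative only."""

    def __init__(self, max_entries: int):
        self.staged = deque(maxlen=max_entries)  # oldest entries evicted at the bound
        self.committed = []

    def stage(self, op):
        self.staged.append(op)

    def commit(self):
        self.committed.extend(self.staged)
        self.staged.clear()

    def rollback(self):
        self.staged.clear()  # uncommitted work vanishes; committed state is untouched

buf = TransactionBuffer(max_entries=2)
buf.stage("write-a"); buf.stage("write-b"); buf.stage("write-c")  # "write-a" evicted
buf.rollback()
assert buf.committed == []
buf.stage("write-d"); buf.commit()
assert buf.committed == ["write-d"]
```

The memory bound matters for the security story: an adversary who floods the buffer exhausts a fixed budget rather than the host's memory.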

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims to present 'enclawed', a configurable, sector-neutral hardening framework for single-user AI assistant gateways based on OpenClaw. It provides two flavors: open (with audit, classification, and DLP signals) and enclaved (with allowlists, FIPS assertion, signature verification, and attestation). The framework includes a 356-case test suite covering tamper detection, signature forgery, various injections, and other attacks, along with a four-level verification lattice closed using skill-formal-* primitives, static effect containment, refinement-typed dispatch, bounded model checking, and a CLI for signed proof-carrying bundles. It emphasizes that it is a hardening framework, not an accredited certification, with responsibilities remaining with the deployer.

Significance. If the implementation and tests support the described features, this work contributes a practical tool for enhancing security, auditability, and trust in AI assistant deployments in regulated industries. The configurable presets, explicit disclaimers, and closure of the verification lattice with proof-carrying bundles represent strengths in making the framework deployable and verifiable at load time.

minor comments (2)
  1. [Abstract] Several sentences in the abstract are quite long and could benefit from being split to improve readability, particularly the one listing the test suite coverage.
  2. [Abstract] The term 'skill-formal-* primitives' is used without a brief definition or reference to prior work, which may hinder immediate understanding for readers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of the enclawed framework, its test coverage, configurable presets, and verification lattice, as well as the recommendation for minor revision. The assessment accurately captures the distinction between the open and enclaved flavors and the explicit disclaimer that this is a hardening framework rather than an accredited certification.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The manuscript is a descriptive account of a configurable software hardening framework (open and enclaved flavors) with a 356-case test suite, audit/DLP signals, allowlists, signature verification, attestation, biconditional gates, strict TypeScript checks, memory-bounded buffers, and a four-level verification lattice closed via skill-formal-* primitives and a CLI for signed proof-carrying bundles. No mathematical derivations, predictions, fitted parameters, or equations appear. No self-citations are invoked as load-bearing premises, and no claims reduce by construction to their own inputs. The text explicitly disclaims accreditation and external responsibilities, keeping the presentation self-contained as an engineering description rather than a deductive chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the OpenClaw base and the added hardening mechanisms deliver the stated security properties in practice. No free parameters or new invented entities are introduced; the work is a software framework description.

axioms (1)
  • domain assumption The OpenClaw AI assistant gateway provides a suitable and secure foundation for the described hardening extensions.
    The entire framework is built as a hard-fork of OpenClaw, so its security properties are presupposed.

pith-pipeline@v0.9.0 · 5629 in / 1322 out tokens · 58951 ms · 2026-05-12T01:46:22.803076+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages · 4 internal anchors

  1. [1] National Institute of Standards and Technology. Security and Privacy Controls for Information Systems and Organizations, NIST Special Publication 800-53 Rev. 5, 2020
  2. [2] National Institute of Standards and Technology. Security Requirements for Cryptographic Modules, FIPS 140-3, 2019
  3. [3] National Institute of Standards and Technology. The NIST Cybersecurity Framework (CSF) 2.0, NIST CSWP 29, 2024
  4. [4] National Institute of Standards and Technology. Protecting Controlled Unclassified Information in Nonfederal Systems and Organizations, NIST SP 800-171 Rev. 2, 2020
  5. [5] International Organization for Standardization. ISO/IEC 27001:2022 — Information security, cybersecurity and privacy protection — Information security management systems — Requirements, 2022
  6. [6] American Institute of Certified Public Accountants. Trust Services Criteria for Security, Availability, Processing Integrity, Confidentiality, and Privacy, 2022
  7. [7] European Parliament and Council of the European Union. Regulation (EU) 2016/679 (General Data Protection Regulation), 2016
  8. [8] U.S. Department of Health and Human Services. HIPAA Security Rule, 45 CFR Part 160 and Subparts A and C of Part 164
  9. [9] PCI Security Standards Council. Payment Card Industry Data Security Standard, v4.0, 2022
  10. [10] D. E. Bell and L. J. LaPadula. Secure Computer System: Unified Exposition and Multics Interpretation, MITRE Technical Report MTR-2997, 1976
  11. [11] P. Loscocco and S. Smalley. Integrating Flexible Support for Security Policies into the Linux Operating System. In Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, 2001
  12. [12] S. Haber and W. S. Stornetta. How to Time-Stamp a Digital Document. Journal of Cryptology, 3(2), 1991
  13. [13] S. Josefsson and I. Liusvaara. Edwards-Curve Digital Signature Algorithm (EdDSA), IETF RFC 8032, 2017
  14. [14] S. A. Crosby and D. S. Wallach. Efficient Data Structures for Tamper-Evident Logging. In 18th USENIX Security Symposium, 2009
  15. [15] Anthropic. Model Context Protocol Specification, 2024. https://modelcontextprotocol.io/
  16. [16] W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. Gonzalez, H. Zhang, and I. Stoica. Efficient Memory Management for Large Language Model Serving with PagedAttention. In Proceedings of the 29th Symposium on Operating Systems Principles (SOSP), 2023
  17. [17] L. Zheng et al. SGLang: Efficient Execution of Structured Language Model Programs. arXiv:2312.07104, 2023
  18. [18] Ollama. https://github.com/ollama/ollama
  19. [19] LM Studio. https://lmstudio.ai/
  20. [20] K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz. Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. In 16th ACM Workshop on Artificial Intelligence and Security (AISec), 2023
  21. [21] F. Perez and I. Ribeiro. Ignore Previous Prompt: Attack Techniques For Language Models. arXiv:2211.09527, 2022
  22. [22] S. Schulhoff et al. Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition. In Proceedings of EMNLP, 2023
  23. [23] SPIFFE: Secure Production Identity Framework for Everyone. https://spiffe.io/
  24. [24] Node.js Foundation. Node.js node:test module, 2024. https://nodejs.org/api/test.html
  25. [25] OpenClaw upstream repository. https://github.com/openclaw/openclaw
  26. [26] National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1, January 2023
  27. [27] International Organization for Standardization. ISO/IEC 42001:2023 — Information technology — Artificial intelligence — Management system, 2023
  28. [28] European Parliament and Council of the European Union. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act), June 2024
  29. [29] OWASP Foundation. OWASP Top 10 for Large Language Model Applications, 2025 edition. https://owasp.org/www-project-top-10-for-large-language-model-applications/
  30. [30] The MITRE Corporation. Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS). https://atlas.mitre.org/
  31. [31] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, A. Oprea, and C. Raffel. Extracting Training Data from Large Language Models. In 30th USENIX Security Symposium, 2021
  32. [32] M. Nasr, N. Carlini, J. Hayase, M. Jagielski, A. F. Cooper, D. Ippolito, C. A. Choquette-Choo, E. Wallace, F. Tramer, and K. Lee. Scalable Extraction of Training Data from (Production) Language Models. arXiv:2311.17035, 2023
  33. [33] National Institute of Standards and Technology. Secure Software Development Framework (SSDF) Version 1.1, NIST SP 800-218, February 2022
  34. [34] Cybersecurity and Infrastructure Security Agency, U.S. National Security Agency, and U.S. Federal Bureau of Investigation. Engaging with Artificial Intelligence: Joint Guidance for Secure Deployment, 2024
  35. [35] Civil Resolution Tribunal of British Columbia. Moffatt v. Air Canada, 2024 BCCRT 149, February 2024
  36. [36] K. J. Biba. Integrity Considerations for Secure Computer Systems, MITRE Technical Report MTR-3153, 1977
  37. [37] D. D. Clark and D. R. Wilson. A Comparison of Commercial and Military Computer Security Policies. In Proceedings of the IEEE Symposium on Security and Privacy, 1987
  38. [38] D. F. C. Brewer and M. J. Nash. The Chinese Wall Security Policy. In Proceedings of the IEEE Symposium on Security and Privacy, 1989
  39. [39] R. S. Sandhu, E. J. Coyne, H. L. Feinstein, and C. E. Youman. Role-Based Access Control Models. IEEE Computer, 29(2):38–47, February 1996
  40. [40] V. C. Hu, D. Ferraiolo, R. Kuhn, A. Schnitzer, K. Sandlin, R. Miller, and K. Scarfone. Guide to Attribute Based Access Control (ABAC) Definition and Considerations, NIST Special Publication 800-162, January 2014
  41. [41] J. H. Saltzer and M. D. Schroeder. The Protection of Information in Computer Systems. Proceedings of the IEEE, 63(9):1278–1308, September 1975
  42. [42] J. Gray. The Transaction Concept: Virtues and Limitations. In Proceedings of the 7th International Conference on Very Large Data Bases (VLDB), 1981
  43. [43] D. Avila and contributors. LibreChat — enhanced ChatGPT clone with multi-provider, plugins, and agents. https://github.com/danny-avila/LibreChat
  44. [44] Open WebUI Project. Open WebUI — user-friendly self-hosted AI interface (formerly ollama-webui). https://github.com/open-webui/open-webui
  45. [45] Mintplex Labs. AnythingLLM — the all-in-one AI app for any LLM with full RAG and agent capabilities. https://github.com/Mintplex-Labs/anything-llm
  46. [46] Jan.ai Project. Jan — bring AI to your desktop. Open-source ChatGPT alternative, runs 100% offline. https://github.com/janhq/jan
  47. [47] LobeHub. Lobe Chat — open-source modern-design AI chat framework. https://github.com/lobehub/lobe-chat
  48. [48] Q. Wu, G. Bansal, J. Zhang, Y. Wu, B. Li, E. Zhu, L. Jiang, X. Zhang, S. Zhang, J. Liu, A. H. Awadallah, R. W. White, D. Burger, and C. Wang. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. arXiv:2308.08155, 2023
  49. [49] H. Chase and the LangChain community. LangChain — building applications with LLMs through composability. https://github.com/langchain-ai/langchain
  50. [50] T. Rebedea, R. Dinu, M. Sreedhar, C. Parisien, and J. Cohen. NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails. arXiv:2310.10501, 2023
  51. [51] H. Inan, K. Upasani, J. Chi, R. Rungta, K. Iyer, Y. Mao, M. Tontchev, Q. Hu, B. Fuller, D. Testuggine, and M. Khabsa. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. arXiv:2312.06674, 2023
  52. [52] Lakera AI. Lakera Guard — protect your LLM applications from prompt injection, PII leaks, and data poisoning. https://www.lakera.ai/lakera-guard
  53. [53] A. Shamir. How to Share a Secret. Communications of the ACM, 22(11):612–613, November 1979
  54. [54] L. Lamport, R. Shostak, and M. Pease. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems, 4(3):382–401, July 1982
  55. [55] National Institute of Standards and Technology. A Profile for U.S. Federal Cryptographic Key Management Systems, NIST Special Publication 800-152, October 2015
  56. [56] A. Metere (Metere Consulting, LLC.). enclawed — open-source half. Public repository. https://github.com/metereconsulting/enclawed