pith. machine review for the scientific record.

arxiv: 2604.11045 · v1 · submitted 2026-04-13 · 💻 cs.SE

Recognition: unknown

Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 15:36 UTC · model grok-4.3

classification 💻 cs.SE
keywords AI coding agents · decoupling · embeddable framework · npm library · multi-agent scheduling · permission control · context compression · programmable infrastructure

The pith

Sema Code releases the AI coding agent engine as a standalone npm library that any application can embed and drive programmatically.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing AI coding agents are locked into specific interfaces such as command-line tools or IDE plugins, which prevents their reuse across different developer environments. The paper introduces Sema Code as a framework that separates the core reasoning engine from these client layers. It publishes the engine as an npm package that any runtime can control through code. Eight supporting mechanisms handle practical issues like isolation between users, input management, context handling, and permissions. This design lets the same agent logic power both a desktop editor extension and a chat-based gateway without changes to the underlying intelligence.
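To make the embeddability claim concrete, here is a minimal sketch of what "any runtime can control the engine through code" could look like. Every name below (`AgentEngine`, `createSession`, `send`) is a hypothetical stand-in invented for illustration; the paper does not document Sema Code's actual npm API.

```typescript
// Hypothetical sketch of an embeddable agent engine. Names are invented
// and do not reflect the real Sema Code npm API, which is not published.

interface AgentReply {
  sessionId: string;
  text: string;
}

// A stand-in "core engine": any client layer drives it through the same calls.
class AgentEngine {
  private sessions = new Map<string, string[]>();

  createSession(id: string): void {
    this.sessions.set(id, []);
  }

  // A real engine would run LLM-backed reasoning here; this sketch echoes.
  send(sessionId: string, prompt: string): AgentReply {
    const history = this.sessions.get(sessionId);
    if (!history) throw new Error(`unknown session: ${sessionId}`);
    history.push(prompt);
    return { sessionId, text: `processed: ${prompt}` };
  }
}

// Two different "client layers" (an IDE extension, a chat gateway)
// share one engine instance, differing only in how they call it.
const engine = new AgentEngine();
engine.createSession("vscode-user");
engine.createSession("chat-user");

const fromIde = engine.send("vscode-user", "refactor foo()");
const fromChat = engine.send("chat-user", "explain bar()");
```

The point of the sketch is the shape of the boundary: clients hold only session handles, and all reasoning state lives behind the engine's programmatic surface.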

Core claim

Sema Code completely decouples the core agent engine from all client layers, publishing it as a standalone npm library that any runtime can drive programmatically. Built around this architecture, the framework implements eight key mechanisms: multi-tenant engine isolation, FIFO input queuing with safe session reconstruction, adaptive context compression, multi-agent collaborative scheduling, intelligent Todo-based process management, four-layer asynchronous permission control, three-tier ecosystem integration, and a background task framework. The same Sema Core engine simultaneously powers a VSCode extension and a multi-channel messaging gateway called SemaClaw, demonstrating that two fundamentally different product forms can share an identical reasoning kernel, differing only at the client layer.
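Of the eight mechanisms, FIFO input queuing with safe session reconstruction is the most self-contained to sketch. The class below is a generic illustration of the concept, assuming a durable per-session log and a last-processed sequence number; it is not Sema Code's implementation, which the manuscript does not publish.

```typescript
// Generic sketch of FIFO input queuing with session reconstruction.
// Illustrative only; Sema Code's actual mechanism is not published.

type Message = { seq: number; text: string };

class SessionQueue {
  private queue: Message[] = [];
  private log: Message[] = []; // durable log used for reconstruction
  private nextSeq = 0;

  enqueue(text: string): void {
    const msg = { seq: this.nextSeq++, text };
    this.log.push(msg); // persist before processing
    this.queue.push(msg);
  }

  // Strict FIFO: inputs are consumed in arrival order.
  dequeue(): Message | undefined {
    return this.queue.shift();
  }

  getLog(): Message[] {
    return [...this.log];
  }

  // "Safe reconstruction": rebuild pending inputs from the log after a
  // crash, skipping everything at or below the last processed sequence.
  static reconstruct(log: Message[], lastProcessedSeq: number): SessionQueue {
    const q = new SessionQueue();
    for (const msg of log) {
      if (msg.seq > lastProcessedSeq) q.queue.push(msg);
      q.log.push(msg);
      q.nextSeq = Math.max(q.nextSeq, msg.seq + 1);
    }
    return q;
  }
}
```

The design choice worth noting is that the log is appended before the queue, so a crash between the two leaves at worst a replayable message, never a lost one.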

What carries the argument

The standalone npm library containing the Sema Core engine, which any runtime can drive programmatically; the eight mechanisms are what make that embedding practical.

If this is right

  • The identical reasoning kernel can support both graphical IDE integrations and text-based messaging interfaces without modification.
  • Enterprises gain the ability to integrate the agent into custom internal tools and workflows.
  • Development teams can extend the agent through plugins, skills, and MCP integrations at the ecosystem layer.
  • Background tasks can run with separated execution and observation privileges to maintain security.
  • Multi-tenant setups become feasible for shared agent resources across an organization.
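The multi-tenant point can be sketched generically: isolation falls out of giving each tenant its own engine state rather than sharing one mutable workspace. The `EngineHost` and `TenantEngine` names below are invented for illustration and are not part of any published Sema Code API.

```typescript
// Sketch of multi-tenant engine isolation: each tenant gets its own
// engine state, so one tenant's sessions can never read another's.
// Hypothetical; the paper does not publish Sema Code's implementation.

class TenantEngine {
  readonly workspace = new Map<string, string>(); // per-tenant file state
}

class EngineHost {
  private tenants = new Map<string, TenantEngine>();

  // Lazily create an isolated engine per tenant id.
  engineFor(tenantId: string): TenantEngine {
    let engine = this.tenants.get(tenantId);
    if (!engine) {
      engine = new TenantEngine();
      this.tenants.set(tenantId, engine);
    }
    return engine;
  }
}

const host = new EngineHost();
host.engineFor("acme").workspace.set("notes.md", "acme draft");
host.engineFor("globex").workspace.set("notes.md", "globex draft");
```

Writes to the same path land in different workspaces because the host never hands two tenants the same engine instance.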

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could enable third-party developers to create entirely new client applications on top of the core agent without needing access to the original source.
  • Similar decoupling patterns might apply to other types of AI agents beyond coding, such as data analysis or design tools.
  • Adoption would reduce duplication of effort in building agent interfaces for different platforms.

Load-bearing premise

The eight mechanisms can be layered onto the agent engine without reducing the quality of its code reasoning or introducing significant delays in responses.

What would settle it

A side-by-side test in which the same coding task is given both to the original locked-in agent and to an embedded Sema Code instance, measuring whether suggestion accuracy drops or response time grows beyond acceptable limits.
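Such a test could be harnessed roughly as follows. The two runner functions are hypothetical stand-ins for the original agent and the embedded instance; a real study would invoke actual agents and score semantic similarity rather than exact string equality.

```typescript
// Sketch of the side-by-side comparison proposed above. Runners are
// placeholders; plug in real agent invocations to run an actual study.

function timeTask(run: () => string): { output: string; ms: number } {
  const start = Date.now();
  const output = run();
  return { output, ms: Date.now() - start };
}

function compare(
  baseline: () => string, // the original locked-in agent
  embedded: () => string, // the embedded Sema Code instance
  maxSlowdownMs = 500, // acceptable added latency budget
): { equivalent: boolean; withinBudget: boolean } {
  const a = timeTask(baseline);
  const b = timeTask(embedded);
  return {
    // A real study would use semantic similarity, not string equality.
    equivalent: a.output === b.output,
    withinBudget: b.ms - a.ms <= maxSlowdownMs,
  };
}
```

Run over a task suite, the two booleans become the drop-in-accuracy and added-latency measurements the premise needs.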

read the original abstract

AI coding agents have become central to developer workflows, yet every existing solution locks its reasoning capabilities within a specific delivery form, such as a CLI, IDE plugin, or web application. This limitation creates systemic barriers when enterprises attempt to reuse these capabilities across heterogeneous engineering environments. To address this challenge, we present Sema Code, an open AI coding framework built on the principle of being embeddable, pluggable, and framework-first. Sema Code completely decouples the core agent engine from all client layers, publishing it as a standalone npm library that any runtime can drive programmatically. Built around this architecture, we designed eight key mechanisms: multi-tenant engine isolation, FIFO input queuing with safe session reconstruction, adaptive context compression, multi-agent collaborative scheduling, intelligent Todo-based process management, four-layer asynchronous permission control, three-tier ecosystem integration spanning MCP, Skills, and Plugins, and a background task framework with separated execution and observation privileges. These mechanisms collectively address the engineering challenges of transforming a complex agent engine into a shared, programmable core. Demonstrating its architectural versatility, the same Sema Core engine simultaneously powers a VSCode extension and a multi-channel messaging gateway, which we name SemaClaw, to unify agent interactions across platforms such as Telegram and Feishu. These represent two fundamentally different product forms sharing an identical reasoning kernel, differing only at the client layer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents Sema Code, an open AI coding framework that decouples the core agent engine from client-specific layers by releasing it as a standalone npm library. It describes eight mechanisms (multi-tenant isolation, FIFO queuing with session reconstruction, adaptive context compression, multi-agent scheduling, Todo-based management, four-layer permission control, three-tier ecosystem integration, and background task separation) that enable the engine to be shared across runtimes. The architecture is illustrated by the same core simultaneously powering a VSCode extension and a multi-channel messaging gateway (SemaClaw) for platforms such as Telegram and Feishu.

Significance. If the mechanisms can be shown to preserve reasoning quality and avoid latency penalties, the work would offer a practical path to reusable AI coding infrastructure, reducing lock-in across IDEs, chat interfaces, and custom applications. The framework-first design and concrete dual-product demonstration are constructive contributions to embeddable agent systems.

major comments (2)
  1. [Abstract] Abstract: the claim that the core engine 'completely decouples' and remains a 'drop-in programmable library' after incorporation of the eight mechanisms (especially adaptive context compression and multi-agent scheduling) is load-bearing but unsupported. No before/after benchmarks on output equivalence, token usage, or latency are supplied, leaving open whether the added layers alter the underlying agent's behavior.
  2. [Architectural demonstration] Architectural demonstration (SemaClaw and VSCode sections): the manuscript states that the identical reasoning kernel drives two fundamentally different client layers, yet provides neither API documentation for the npm library, installation or usage examples, nor any reproducibility artifact that would allow an independent runtime to drive the engine programmatically.
minor comments (2)
  1. [Abstract] The eight mechanisms are enumerated without an accompanying diagram or table that maps each mechanism to the specific engineering challenge it addresses and to its interaction with the core engine.
  2. [Introduction] Related-work positioning is thin; explicit comparisons to other attempts at embeddable or library-form agent engines would help readers assess novelty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that the core engine 'completely decouples' and remains a 'drop-in programmable library' after incorporation of the eight mechanisms (especially adaptive context compression and multi-agent scheduling) is load-bearing but unsupported. No before/after benchmarks on output equivalence, token usage, or latency are supplied, leaving open whether the added layers alter the underlying agent's behavior.

    Authors: We agree that the absence of quantitative benchmarks leaves the decoupling claims open to the interpretation raised. The manuscript demonstrates architectural separation through the shared core powering two distinct clients, but does not include controlled before/after measurements. In the revised manuscript we will add an evaluation subsection that reports output equivalence (measured by semantic similarity of generated artifacts), token usage, and latency for representative coding tasks executed with and without the eight mechanisms. This will directly test whether the isolation, queuing, compression, and scheduling layers preserve the underlying agent's behavior. revision: yes

  2. Referee: [Architectural demonstration] Architectural demonstration (SemaClaw and VSCode sections): the manuscript states that the identical reasoning kernel drives two fundamentally different client layers, yet provides neither API documentation for the npm library, installation or usage examples, nor any reproducibility artifact that would allow an independent runtime to drive the engine programmatically.

    Authors: We acknowledge that the current manuscript describes the mechanisms at a high level without supplying the concrete API surface, installation instructions, or runnable examples needed for independent programmatic use. In the revision we will add a dedicated section (or appendix) that documents the public npm API, provides installation and basic usage snippets, and illustrates how a third-party runtime can instantiate and drive the core engine. We will also release the library and minimal driver examples in a public repository to enable direct reproducibility. revision: yes

Circularity Check

0 steps flagged

No circularity: purely descriptive architecture paper with no derivations or self-referential reductions

full rationale

The manuscript presents Sema Code as an embeddable npm library that decouples the agent engine from client layers, supported by eight listed mechanisms (multi-tenant isolation, FIFO queuing, context compression, etc.). These are introduced as design choices that address engineering challenges, with no equations, fitted parameters, predictions, or load-bearing self-citations. The central claim of functional equivalence across product forms (VSCode extension and SemaClaw) is asserted via the shared kernel but rests on architectural description rather than any chain that reduces to its own inputs by construction. No self-definitional loops, renamed empirical patterns, or uniqueness theorems appear.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software engineering architecture paper. No mathematical free parameters, axioms, or invented physical entities are introduced; the mechanisms are engineering design choices.

pith-pipeline@v0.9.0 · 5575 in / 1015 out tokens · 55250 ms · 2026-05-10T15:36:57.628690+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages · 8 internal anchors

  1. [1]

    Evaluating Large Language Models Trained on Code

    Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021.

  2. [2]

    DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

    Daya Guo, Qihao Zhu, Dejian Yang, et al. DeepSeek-Coder: When the large language model meets programming -- the rise of code intelligence. arXiv preprint arXiv:2401.14196, 2024.

  3. [3]

    MemGPT: Towards LLMs as Operating Systems

    Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. MemGPT: Towards LLMs as operating systems. arXiv preprint arXiv:2310.08560, 2023.

  4. [4]

    The Impact of AI on Developer Productivity: Evidence from GitHub Copilot

    Sida Peng, Eirini Kalliamvakou, Peter Cihon, and Mert Demirer. The impact of AI on developer productivity: Evidence from GitHub Copilot. arXiv preprint arXiv:2302.06590, 2023.

  5. [5]

    Code Llama: Open Foundation Models for Code

    Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, et al. Code Llama: Open foundation models for code. arXiv preprint arXiv:2308.12950, 2023.

  6. [6]

    A Framework for Formalizing LLM Agent Security

    Vincent Siu, Jingxuan He, Kyle Montgomery, Zhun Wang, Neil Gong, Chenguang Wang, and Dawn Song. A framework for formalizing LLM agent security. arXiv preprint arXiv:2603.19469, 2026.

  7. [7]

    OpenHands: An Open Platform for AI Software Developers as Generalist Agents

    Xingyao Wang, et al. OpenHands: An open platform for AI software developers as generalist agents. arXiv preprint arXiv:2407.16741, 2024.

  8. [8]

    AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

    Qingyun Wu, Gagan Bansal, Jieyu Zhang, et al. AutoGen: Enabling next-gen LLM applications via multi-agent conversation. arXiv preprint arXiv:2308.08155, 2023.

  9. [9]

    Agentless: Demystifying LLM-based Software Engineering Agents

    Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, and Lingming Zhang. Agentless: Demystifying LLM-based software engineering agents. arXiv preprint arXiv:2407.01489, 2024.