Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

· 2026 · cs.SE · arXiv 2604.14228

7 Pith papers cite this work. Polarity classification is still indexing.

abstract

Claude Code is an agentic coding tool that can run shell commands, edit files, and call external services on behalf of the user. This study describes its comprehensive architecture by analyzing the publicly available TypeScript source code and further comparing it with OpenClaw, an independent open-source AI agent system that answers many of the same design questions from a different deployment context. Our analysis identifies five human values, philosophies, and needs that motivate the architecture (human decision authority, safety and security, reliable execution, capability amplification, and contextual adaptability) and traces them through thirteen design principles to specific implementation choices. The core of the system is a simple while-loop that calls the model, runs tools, and repeats. Most of the code, however, lives in the systems around this loop: a permission system with seven modes and an ML-based classifier, a five-layer compaction pipeline for context management, four extensibility mechanisms (MCP, plugins, skills, and hooks), a subagent delegation mechanism with worktree isolation, and append-oriented session storage. A comparison with OpenClaw, a multi-channel personal assistant gateway, shows that the same recurring design questions produce different architectural answers when the deployment context changes: from per-action safety classification to perimeter-level access control, from a single CLI loop to an embedded runtime within a gateway control plane, and from context-window extensions to gateway-wide capability registration. We finally identify six open design directions for future agent systems, grounded in recent empirical, architectural, and policy literature.
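
As a rough illustration of the loop the abstract describes (call the model, run the requested tools, repeat), a minimal sketch might look like the following. Every name here (`runAgentLoop`, `Model`, `Tool`, `stubModel`) is hypothetical and chosen for illustration; none is taken from the Claude Code source, and the permission, compaction, and extensibility systems the abstract mentions are omitted.

```typescript
// Hypothetical types for illustration only; not the Claude Code API.
type ToolCall = { tool: string; input: string };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (history: Message[]) => ModelReply;
type Tool = (input: string) => string;

// Core loop: ask the model, run any tools it requests, feed results back, repeat.
function runAgentLoop(
  model: Model,
  tools: Record<string, Tool>,
  prompt: string,
  maxTurns = 10, // assumed safety cap so the sketch always terminates
): Message[] {
  const history: Message[] = [{ role: "user", content: prompt }];
  let turns = 0;
  while (turns < maxTurns) {
    turns += 1;
    const reply = model(history);
    history.push({ role: "assistant", content: reply.text });
    if (reply.toolCalls.length === 0) break; // no tool requests: the task is done
    for (const call of reply.toolCalls) {
      const tool = tools[call.tool];
      const result = tool ? tool(call.input) : `unknown tool: ${call.tool}`;
      history.push({ role: "tool", content: result });
    }
  }
  return history;
}

// Stub model: requests one tool call, then finishes once a tool result is present.
const stubModel: Model = (history) =>
  history.some((m) => m.role === "tool")
    ? { text: "done", toolCalls: [] }
    : { text: "checking", toolCalls: [{ tool: "echo", input: "hi" }] };

const transcript = runAgentLoop(
  stubModel,
  { echo: (s) => s.toUpperCase() },
  "say hi",
);
```

In this sketch the loop itself stays small; in a real system, checks such as permission gating or context compaction would wrap the model call and the tool execution rather than change the loop's shape.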

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

cs.AI · 2026-05-29 · unverdicted · novelty 7.0

Proposes the Intelligent Computing Architecture (ICA) as a six-layer framework with dual probabilistic-deterministic planes and three Amdahl-style heuristics to unify design of LLM-based systems.

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

cs.AI · 2026-05-05 · unverdicted · novelty 6.0

An agentic red teaming system automates creation of adversarial testing workflows from natural language goals, unifying ML and generative AI attacks and achieving 85% success rate on Meta Llama Scout with no custom human code.

HARBOR: Automated Harness Optimization

cs.LG · 2026-04-22 · unverdicted · novelty 6.0

HARBOR formalizes harness optimization as constrained noisy Bayesian optimization over mixed-variable spaces and reports a case study where it outperforms manual tuning on a production coding agent.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

cs.CR · 2026-05-31 · unverdicted · novelty 5.0

BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 82.38%.

Code as Agent Harness

cs.CL · 2026-05-18 · accept · novelty 5.0

A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed open challenges.

Decision Evidence Maturity Model for Agentic AI: A Property-Level Method Specification

cs.CY · 2026-04-29 · unverdicted · novelty 4.0

DEMM defines four executable evidence-sufficiency categories plus a conflicting category for agentic AI decisions and rolls per-property verdicts into a five-level maturity rubric.

"Skill issues": data-centric optimization of lakehouse agents

cs.AI · 2026-05-31 · unverdicted · novelty 3.0

Data-centric optimization of skills for agents on a branching lakehouse improves accuracy by 31.9% on 25 tasks via state-verification evaluation.

citing papers explorer

Showing 7 of 7 citing papers.

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture · cs.AI · 2026-05-29 · unverdicted · none · ref 93 · internal anchor
Proposes the Intelligent Computing Architecture (ICA) as a six-layer framework with dual probabilistic-deterministic planes and three Amdahl-style heuristics to unify design of LLM-based systems.
Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours · cs.AI · 2026-05-05 · unverdicted · none · ref 25 · internal anchor
An agentic red teaming system automates creation of adversarial testing workflows from natural language goals, unifying ML and generative AI attacks and achieving 85% success rate on Meta Llama Scout with no custom human code.
HARBOR: Automated Harness Optimization · cs.LG · 2026-04-22 · unverdicted · none · ref 7 · internal anchor
HARBOR formalizes harness optimization as constrained noisy Bayesian optimization over mixed-variable spaces and reports a case study where it outperforms manual tuning on a production coding agent.
BraveGuard: From Open-World Threats to Safer Computer-Use Agents · cs.CR · 2026-05-31 · unverdicted · none · ref 22 · internal anchor
BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 82.38%.
Code as Agent Harness · cs.CL · 2026-05-18 · accept · none · ref 217 · internal anchor
A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed open challenges.
Decision Evidence Maturity Model for Agentic AI: A Property-Level Method Specification · cs.CY · 2026-04-29 · unverdicted · none · ref 19 · internal anchor
DEMM defines four executable evidence-sufficiency categories plus a conflicting category for agentic AI decisions and rolls per-property verdicts into a five-level maturity rubric.
"Skill issues": data-centric optimization of lakehouse agents · cs.AI · 2026-05-31 · unverdicted · none · ref 22 · internal anchor
Data-centric optimization of skills for agents on a branching lakehouse improves accuracy by 31.9% on 25 tasks via state-verification evaluation.
