Architecture Without Architects: How AI Coding Agents Shape Software Architecture
Pith reviewed 2026-05-13 17:16 UTC · model grok-4.3
The pith
AI coding agents make implicit architectural decisions based on prompt wording alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AI coding agents select frameworks, scaffold infrastructure, and wire integrations, often in seconds. These are architectural decisions, made implicitly through five mechanisms. Six prompt-architecture coupling patterns map natural-language prompt features to the infrastructure they require. The patterns range from contingent couplings, such as structured output validation, which may weaken as models improve, to fundamental ones, such as tool-call orchestration, which persist regardless of model capability. An illustrative demonstration confirms that prompt wording alone produces structurally different systems for the same task. The phenomenon is termed vibe architecting.
What carries the argument
Six prompt-architecture coupling patterns that map natural-language prompt features to the infrastructure they require.
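As a rough illustration of what such a mapping could look like in code, the sketch below encodes only the two patterns the abstract names. The trigger phrases, the infrastructure descriptions, and the names `CouplingPattern` and `implied_patterns` are assumptions made for this example; the paper's own catalogue and detection method are not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Coupling(Enum):
    CONTINGENT = "contingent"    # may weaken as models improve
    FUNDAMENTAL = "fundamental"  # persists regardless of model capability


@dataclass(frozen=True)
class CouplingPattern:
    name: str
    coupling: Coupling
    trigger_phrases: tuple[str, ...]  # hypothetical cues, not taken from the paper
    infrastructure: str               # hypothetical description


# Only the two patterns named in the abstract; the other four are not guessed at.
PATTERNS = (
    CouplingPattern(
        name="structured output validation",
        coupling=Coupling.CONTINGENT,
        trigger_phrases=("return json", "as json", "matching this schema"),
        infrastructure="schema validator plus retry loop",
    ),
    CouplingPattern(
        name="tool-call orchestration",
        coupling=Coupling.FUNDAMENTAL,
        trigger_phrases=("look up", "call the api", "search the web"),
        infrastructure="tool registry plus dispatch loop",
    ),
)


def implied_patterns(prompt: str) -> list[CouplingPattern]:
    """Return the patterns whose cue phrases occur in the prompt (naive substring match)."""
    text = prompt.lower()
    return [p for p in PATTERNS if any(cue in text for cue in p.trigger_phrases)]


if __name__ == "__main__":
    for pattern in implied_patterns("Look up the order status and return JSON"):
        print(pattern.name, "->", pattern.infrastructure, f"({pattern.coupling.value})")
```

Substring matching is deliberately crude; it is here only to make the idea of a prompt feature implying infrastructure concrete.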
If this is right
- Architectural review of AI-generated code must examine the prompt features that triggered framework and integration choices.
- Some coupling patterns will weaken with model improvements while others remain independent of capability gains.
- Decision records should capture the prompt elements that shaped infrastructure decisions.
- Tooling is required to surface and govern these previously hidden choices during development.
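One way to picture a decision record that captures prompt elements is the minimal sketch below. The class name `PromptDecisionRecord`, its fields, and the example values are all hypothetical; the paper's actual record format is not shown on this page.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PromptDecisionRecord:
    """Hypothetical record linking an infrastructure decision to the prompt behind it."""
    title: str
    decision: str                 # what the agent introduced
    prompt_excerpt: str           # the wording that plausibly triggered it
    triggering_features: list[str] = field(default_factory=list)
    agent: str = "unspecified"    # which coding agent produced the change
    reviewed: bool = False        # set once a human has reviewed the decision

    def to_json(self) -> str:
        # Stable key order keeps the record diff-friendly in version control.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; not drawn from the paper.
record = PromptDecisionRecord(
    title="Order status service scaffolding",
    decision="Agent added a tool dispatch loop and a schema validator",
    prompt_excerpt="Build a service that looks up order status and returns JSON",
    triggering_features=["tool-call orchestration", "structured output validation"],
)

if __name__ == "__main__":
    print(record.to_json())
```

Storing the prompt excerpt next to the decision lets a reviewer see which wording is associated with which piece of infrastructure, which is the point of the third bullet above.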
Where Pith is reading between the lines
- Prompt templates could be standardized to produce consistent architectural outcomes across similar tasks.
- The same mechanisms may interact with model training data biases in ways that compound over successive generations of code.
- Developers could treat prompt variation as an explicit design variable to explore alternative architectures before committing to one.
Load-bearing premise
Differences in generated systems for the same task are caused primarily by prompt wording rather than by model stochasticity, training data, or other uncontrolled factors, and these differences qualify as architectural decisions.
What would settle it
Re-running the illustrative demonstration with identical prompts but fixed random seeds across multiple model calls and checking whether structurally different systems still appear.
Original abstract
AI coding agents select frameworks, scaffold infrastructure, and wire integrations, often in seconds. These are architectural decisions, yet almost no one reviews them as such. We identify five mechanisms by which agents make implicit architectural choices and propose six prompt-architecture coupling patterns that map natural-language prompt features to the infrastructure they require. The patterns range from contingent couplings (structured output validation) that may weaken as models improve to fundamental ones (tool-call orchestration) that persist regardless of model capability. An illustrative demonstration confirms that prompt wording alone produces structurally different systems for the same task. We term the phenomenon vibe architecting, architecture shaped by prompts rather than deliberate design, and outline review practices, decision records, and tooling to bring these hidden decisions under governance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that AI coding agents make implicit architectural decisions via five mechanisms, proposes six prompt-architecture coupling patterns (ranging from contingent to fundamental) that map natural-language prompt features to infrastructure requirements, demonstrates via illustration that prompt wording alone yields structurally different systems for identical tasks, introduces the term 'vibe architecting', and outlines governance practices including review processes and tooling.
Significance. If the mechanisms and patterns hold under controlled conditions, the work would be significant for software engineering by surfacing how prompt engineering influences architecture in AI-assisted development and by proposing actionable governance approaches. The conceptual framing is timely and identifies a previously under-examined phenomenon, though its current reliance on an uncontrolled illustration limits immediate impact.
major comments (1)
- [Illustrative Demonstration] The illustrative demonstration (abstract and associated section) asserts that 'prompt wording alone produces structurally different systems' but provides no details on temperature, top-p, seeds, number of replicates per prompt, or statistical tests comparing within-prompt versus between-prompt variance. This omission is load-bearing for the five mechanisms and six coupling patterns, as observed differences may stem from sampling noise rather than the claimed prompt features.
minor comments (1)
- [Abstract] The abstract introduces 'vibe architecting' without a concise definition; a one-sentence operational definition would improve readability for readers unfamiliar with the framing.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the timeliness of examining how AI coding agents influence software architecture. We agree that the illustrative demonstration requires greater methodological transparency and rigor to support the five mechanisms and six coupling patterns. We will revise the manuscript to address this by adding the requested details on generation parameters, replicates, and variance analysis.
Point-by-point responses
Referee: [Illustrative Demonstration] The illustrative demonstration (abstract and associated section) asserts that 'prompt wording alone produces structurally different systems' but provides no details on temperature, top-p, seeds, number of replicates per prompt, or statistical tests comparing within-prompt versus between-prompt variance. This omission is load-bearing for the five mechanisms and six coupling patterns, as observed differences may stem from sampling noise rather than the claimed prompt features.
Authors: We acknowledge that the current illustrative demonstration does not report the generation hyperparameters or perform replicates with statistical validation, which limits the strength of the evidence. While the demonstration was designed to show qualitative structural differences arising from prompt variations rather than to serve as a controlled empirical study, we agree this creates ambiguity about sampling noise. In the revised manuscript we will: specify the exact model settings (temperature, top-p, and seed where applicable), generate a minimum of five replicates per prompt variant for the same task, document the observed architectural differences across replicates, and include a comparison of within-prompt versus between-prompt structural variance (using both qualitative descriptions and, where feasible, simple similarity metrics). These additions will make the support for the coupling patterns more robust without altering the illustrative nature of the section.
Revision: yes
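The within-prompt versus between-prompt comparison could be sketched as follows, assuming each generated system is summarised as a set of structural features (frameworks, datastores, queues) and that Jaccard distance is an acceptable "simple similarity metric". Both are assumptions of this example, not choices stated by the authors, and the data is a made-up placeholder rather than a result from the paper.

```python
from itertools import combinations, product
from statistics import mean


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def within_prompt_distance(runs_by_prompt: dict) -> float:
    """Mean pairwise distance between replicates of the same prompt."""
    return mean(
        jaccard_distance(a, b)
        for runs in runs_by_prompt.values()
        for a, b in combinations(runs, 2)
    )


def between_prompt_distance(runs_by_prompt: dict) -> float:
    """Mean pairwise distance between replicates of different prompts."""
    return mean(
        jaccard_distance(a, b)
        for runs_a, runs_b in combinations(runs_by_prompt.values(), 2)
        for a, b in product(runs_a, runs_b)
    )


# Placeholder data: three replicates per prompt variant, abstract feature names.
runs = {
    "prompt_A": [
        frozenset({"framework_x", "sqlite"}),
        frozenset({"framework_x", "sqlite"}),
        frozenset({"framework_x", "sqlite", "cache"}),
    ],
    "prompt_B": [
        frozenset({"framework_y", "queue", "validator"}),
        frozenset({"framework_y", "queue", "validator"}),
        frozenset({"framework_y", "queue"}),
    ],
}

if __name__ == "__main__":
    print("within :", round(within_prompt_distance(runs), 3))
    print("between:", round(between_prompt_distance(runs), 3))
```

If the between-prompt distance is not clearly larger than the within-prompt distance, sampling noise remains a plausible explanation for the observed differences. A real analysis would need more replicates and a significance test, neither of which this sketch attempts.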
Circularity Check
No circularity: observational patterns with no derivations or self-referential reductions
The paper identifies five mechanisms and proposes six prompt-architecture coupling patterns as direct observations from an illustrative demonstration. No equations, fitted parameters, or closed derivation chains exist. The patterns are presented as mappings derived from prompt wording differences rather than outputs forced by construction from prior inputs or self-citations. The demonstration is described as confirmatory but not as a statistical model whose outputs are renamed as predictions. This is a standard non-circular conceptual proposal.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: AI coding agents make decisions that qualify as architectural when they select frameworks, scaffold infrastructure, or wire integrations.
Forward citations
Cited by 1 Pith paper
- Formal Architecture Descriptors as Navigation Primitives for AI Coding Agents
Formal architecture descriptors reduce AI coding agent navigation steps by 33-44% and behavioral variance by 52% in controlled and observational studies.