AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
2 Pith papers cite this work.
abstract
The promise of Large Language Models in automated software engineering is often measured by functional correctness, overlooking the critical issue of long-term maintainability. This paper presents a systematic audit of technical debt in AI-generated software, revealing that AI does not eliminate flaws but rather introduces a distinct machine signature of defects. Our multi-scale analysis, spanning single-file algorithmic tasks and complex, agent-generated systems, identifies a fundamental Reasoning-Complexity Trade-off: as models become more capable, they generate increasingly bloated and coupled code. This architectural decay is so pronounced that we establish a Volume-Quality Inverse Law, where code volume is a near-perfect predictor of structural degradation. Crucially, we demonstrate that neither functional correctness nor detailed prompting mitigates this decay. These findings challenge the current paradigm of prompt-driven generation, reframing the central problem of AI-based software engineering from one of code generation to one of architectural complexity management. We conclude that future progress depends on equipping agents with explicit architectural foresight to ensure the software they build is not just functional, but also maintainable.
years
2026 (2)
verdicts
UNVERDICTED (2)
representative citing papers
MicroSkill Architecture partitions knowledge into atomic skill capsules selected via constrained optimization to cut token use over 90% and improve code generation metrics in one enterprise case study.
citing papers explorer
- Rethinking Complexity Metrics for LLM-Integrated Applications: Beyond Source Code
HECATE generates and validates ten complexity metrics (seven new) for LLM apps by treating prompts as behavioral specifications and filtering against maintenance activity from version history, showing prompt complexity as an independent factor.
- Microskill Architecture: A Modular Skill-Driven Framework for AI-Native Code Generation
MicroSkill Architecture partitions knowledge into atomic skill capsules selected via constrained optimization to cut token use over 90% and improve code generation metrics in one enterprise case study.