arxiv: 2602.14690 · v4 · submitted 2026-02-16 · 💻 cs.SE

Configuring Agentic AI Coding Tools: An Exploratory Study

Matthias Galster , Seyedmoein Mohsenimofidi , Jai Lal Lulla , Muhammad Auwal Abubakar , Christoph Treude , Sebastian Baltes

Pith reviewed 2026-05-15 22:05 UTC · model grok-4.3

classification 💻 cs.SE

keywords agentic AI · AI coding tools · configuration mechanisms · context files · AGENTS.md · GitHub repositories · empirical study · software development

The pith

Context files dominate how developers configure agentic AI coding tools, with AGENTS.md emerging as an interoperable standard.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines the ways developers set up agentic AI coding tools such as Claude Code, GitHub Copilot, Cursor, Gemini, and Codex by looking at versioned files in GitHub repositories. It identifies eight configuration mechanisms ranging from static context files to executable scripts and external integrations. Analysis of 2,853 repositories shows that context files are by far the most common choice and frequently the only mechanism present. Advanced options like skills and subagents appear in very few projects, and skills themselves usually consist of static instructions rather than runnable code. These patterns create an empirical baseline for current practices and point to AGENTS.md as a practical entry point for configuration across tools.

Core claim

In an empirical study of 2,853 GitHub repositories, context files dominate the configuration landscape for agentic AI coding tools and are often the sole mechanism, with AGENTS.md emerging as an interoperable standard across tools. Few repositories adopt advanced mechanisms such as skills and subagents. Skills predominantly rely on static instructions rather than executable scripts. Distinct configuration practices are forming around different tools, with Claude Code users employing the broadest range of mechanisms.

What carries the argument

Context Files, including the AGENTS.md format, as the primary repository-level artifacts that supply instructions and context to agentic AI coding tools.

Load-bearing premise

The 2,853 GitHub repositories examined are representative of how developers typically configure these tools and that the eight identified mechanisms cover the main approaches in use.

What would settle it

A broader sample or developer survey revealing that most configuration happens through mechanisms outside the eight identified ones or that skills and subagents are adopted at high rates in typical projects.

Figures

Figures reproduced from arXiv: 2602.14690 by Christoph Treude, Jai Lal Lulla, Matthias Galster, Muhammad Auwal Abubakar, Sebastian Baltes, Seyedmoein Mohsenimofidi.

**Figure 1.** Data collection process. We consider configuration mechanisms that are captured in configuration artifacts to be consumed by agentic tools, are repository-versioned for collaborative maintenance, and include instructions intended to customize the behavior of AI coding tools. With this paper, we provide the following contributions: (1) We systematically documented eight configuration mechanisms: Context Fi…

**Figure 2.** Adoption of agentic tools per programming lan

**Figure 3.** Usage of configuration mechanisms across agen

**Figure 5.** Configuration mechanism count per repository.

**Figure 6.** Creation order of Context Files per repository. Curly braces indicate that files were added on the same day. 6 Details of Configuration Mechanisms (RQ3) We analyzed three mechanisms in detail: Context Files and Skills because they are supported by all tools, and Subagents because they follow a similar format as Skills. 6.1 Configuration Mechanism: Context Files Context Files are Markdown files that provide…

Original abstract

Agentic AI coding tools increasingly automate software development tasks. Developers can configure these tools through versioned repository-level artifacts such as Markdown and JSON files. We present a systematic analysis of configuration mechanisms for agentic AI coding tools, covering Claude Code, GitHub Copilot, Cursor, Gemini, and Codex. We identify eight configuration mechanisms spanning from static context to executable and external integrations and, in an empirical study of 2,853 GitHub repositories, examine whether and how they are adopted, with a detailed analysis of Context Files, Skills, and Subagents. First, Context Files dominate the configuration landscape and are often the sole mechanism in a repository, with AGENTS$.$md emerging as an interoperable standard across tools. Second, few repositories adopt advanced mechanisms such as Skills and Subagents. Skills predominantly rely on static instructions rather than executable scripts. Third, distinct configuration practices are forming around different tools, with Claude Code users employing the broadest range of mechanisms. These findings establish an empirical baseline for understanding how developers configure agentic tools, suggest that AGENTS$.$md serves as a natural starting point, and motivate longitudinal and experimental research on how configuration strategies evolve and affect agent performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

First counts on how devs configure agentic AI tools show context files dominate and AGENTS.md spreading, but the repo sample may be biased toward those with configs already.

This paper delivers the first broad empirical numbers on configuration practices for tools like Claude Code, Copilot, Cursor, Gemini, and Codex. In 2,853 GitHub repositories they map eight mechanisms and report that context files are the clear default, often the only one used, while AGENTS.md appears as an emerging cross-tool standard. Advanced options like Skills and Subagents see very low uptake, and Skills mostly stick to static instructions rather than code. Tool-specific differences show up too, with Claude users trying the widest mix.

Referee Report

3 major / 2 minor

Summary. The paper identifies eight configuration mechanisms for agentic AI coding tools (Claude Code, GitHub Copilot, Cursor, Gemini, Codex) spanning static context to executable integrations. In an empirical study of 2,853 GitHub repositories, it reports that Context Files dominate and are often the sole mechanism, with AGENTS.md emerging as an interoperable standard; few repositories adopt advanced mechanisms such as Skills and Subagents (which mostly use static instructions); and distinct tool-specific practices exist, with Claude Code users employing the broadest range. The work positions these findings as an empirical baseline motivating further longitudinal and experimental research.

Significance. If the sampling and classification are shown to be unbiased and representative, the study supplies a useful snapshot of current developer practices in configuring agentic coding tools. It usefully flags AGENTS.md as a potential de-facto standard and identifies under-adoption of more sophisticated mechanisms, thereby providing a concrete starting point for research on how configuration choices affect agent performance and for tool designers seeking interoperability.

major comments (3)

[Empirical study section] The description of how the 2,853 repositories were identified and sampled (search terms, inclusion/exclusion criteria, GitHub API or search filters) is absent or insufficiently detailed. This is load-bearing because the central claim that Context Files dominate and are often the sole mechanism could be an artifact of conditioning the corpus on the presence of the very filenames being measured.
[Methods / Identification of mechanisms] No details are supplied on how the eight mechanisms were systematically identified, how repositories were classified into mechanisms, inter-rater reliability, or any validation of the classification scheme. Without these, the reported adoption rates and the distinction between static vs. executable Skills cannot be assessed for reliability.
[Results on Context Files and adoption patterns] The quantitative statements about dominance (e.g., Context Files as sole mechanism in many repositories) lack accompanying counts, percentages, or breakdowns by tool that would allow readers to judge effect sizes and to verify that the patterns survive controls for sampling bias.

minor comments (2)

[Abstract] The abstract contains the typographical artifact 'AGENTS$.$md'; this should be rendered as AGENTS.md.
[Throughout] Mechanism names and tool names should be defined once with a table or glossary and then used consistently; several passages introduce slight variations in terminology that could confuse readers.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important areas where additional methodological transparency will strengthen the paper. We address each major comment below and will incorporate revisions in the next version of the manuscript.

Referee: The description of how the 2,853 repositories were identified and sampled (search terms, inclusion/exclusion criteria, GitHub API or search filters) is absent or insufficiently detailed. This is load-bearing because the central claim that Context Files dominate and are often the sole mechanism could be an artifact of conditioning the corpus on the presence of the very filenames being measured.

Authors: We agree that the sampling procedure must be described in full detail. The 2,853 repositories were obtained via the GitHub Search API using queries for the presence of specific filenames (e.g., filename:AGENTS.md, filename:.cursorrules, filename:AGENT.md and equivalents for the other tools), restricted to public, non-forked repositories with at least one commit in the prior 12 months. Inclusion required at least one matching configuration file; we excluded archived repositories and those whose primary language was not a programming language. We will add a dedicated subsection to the Empirical Study section that lists the exact search strings, API parameters, total hits before filtering, deduplication steps, and inclusion/exclusion criteria. We will also explicitly discuss the sampling bias the referee correctly identifies as a limitation and note that our prevalence figures are conditional on the presence of at least one configuration artifact. revision: yes
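
The inclusion and exclusion rules described in this response could be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline: it makes no network calls, the record field names (`full_name`, `private`, `fork`, `archived`, `language`, `pushed_at`) are modelled on GitHub's repository JSON, the `NON_PROGRAMMING` set is an assumption, and the last push date is used only as an approximation of "at least one commit in the prior 12 months".

```python
from datetime import datetime, timedelta, timezone

# Only the filenames named in the response; the other tools' filenames are not listed there.
CONFIG_FILENAMES = ["AGENTS.md", ".cursorrules", "AGENT.md"]

# Illustrative assumption: primary languages treated as "not a programming language".
NON_PROGRAMMING = {None, "Markdown", "HTML", "CSS"}


def build_queries(filenames):
    """Return one `filename:` search string per configuration filename."""
    return [f"filename:{name}" for name in filenames]


def is_eligible(repo, now, max_age_days=365):
    """Apply the stated filters to one repository metadata record (a dict)."""
    if repo["private"] or repo["fork"] or repo["archived"]:
        return False
    if repo["language"] in NON_PROGRAMMING:
        return False
    # Approximation: last push date stands in for "most recent commit".
    pushed = datetime.fromisoformat(repo["pushed_at"].replace("Z", "+00:00"))
    return now - pushed <= timedelta(days=max_age_days)


def select_repositories(hits, now):
    """Deduplicate search hits by repository name, then keep eligible ones."""
    seen = set()
    selected = []
    for repo in hits:
        name = repo["full_name"]
        if name in seen:
            continue  # the same repository can match several filename queries
        seen.add(name)
        if is_eligible(repo, now):
            selected.append(repo)
    return selected
```

Because every selected repository matched at least one filename query, any prevalence figure computed on the result is conditional on a configuration artifact being present, which is the limitation the referee raises.
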
Referee: No details are supplied on how the eight mechanisms were systematically identified, how repositories were classified into mechanisms, inter-rater reliability, or any validation of the classification scheme. Without these, the reported adoption rates and the distinction between static vs. executable Skills cannot be assessed for reliability.

Authors: The eight mechanisms were first enumerated by systematically reviewing the official documentation and configuration examples published by each tool vendor, followed by an exploratory scan of 50 high-star repositories to confirm the mechanisms in practice. Repository classification combined automated filename and content heuristics with manual review: two authors independently coded a stratified random sample of 200 repositories, reaching 91% raw agreement (Cohen’s κ = 0.87). Disagreements were resolved by joint discussion and the final coding rules were documented. For Skills we distinguished static instruction files from executable scripts by inspecting file extensions and content (presence of shebang lines or code blocks). We will insert a new Methods subsection that fully documents the identification process, the coding protocol, the inter-rater statistics, and the precise criteria used to separate static versus executable Skills. revision: yes
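
For readers unfamiliar with the two agreement statistics quoted in this response, a minimal sketch of how raw agreement and Cohen's κ are computed for two coders is given below. The function names and label values are hypothetical and the code does not reproduce the study's data; it only shows the standard definition κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each coder's label frequencies.

```python
from collections import Counter


def raw_agreement(labels_a, labels_b):
    """Fraction of items on which both coders assigned the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders over the same items."""
    n = len(labels_a)
    p_o = raw_agreement(labels_a, labels_b)
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: product of the two coders' marginal label proportions.
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)
```

κ is lower than raw agreement whenever some agreement is expected by chance, which is why the two figures are reported side by side.
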
Referee: The quantitative statements about dominance (e.g., Context Files as sole mechanism in many repositories) lack accompanying counts, percentages, or breakdowns by tool that would allow readers to judge effect sizes and to verify that the patterns survive controls for sampling bias.

Authors: We will augment the Results section with a new table (and accompanying text) that reports exact counts and percentages for every mechanism, the proportion of repositories in which Context Files are the sole mechanism, and tool-specific breakdowns (e.g., percentage of Claude Code repositories using only Context Files versus those using additional mechanisms). We will also add a short discussion of how the observed patterns relate to the sampling frame. These additions will supply the numerical detail needed to assess effect sizes and will be accompanied by a limitations paragraph addressing sampling bias. revision: yes
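
The kind of table promised in this response could be derived along the following lines. This is a sketch under assumptions: each repository is represented by a hypothetical record with a `tools` set and a `mechanisms` set, the function names are invented for illustration, and no figures from the study are reproduced.

```python
from collections import Counter


def mechanism_table(records):
    """Map each mechanism to (count, percentage of all repositories)."""
    total = len(records)
    counts = Counter(m for rec in records for m in rec["mechanisms"])
    return {m: (n, 100.0 * n / total) for m, n in counts.items()}


def sole_share(records, mechanism="Context Files"):
    """Share of repositories whose only mechanism is `mechanism`."""
    if not records:
        raise ValueError("no records")
    sole = sum(1 for rec in records if rec["mechanisms"] == {mechanism})
    return sole / len(records)


def sole_share_by_tool(records, mechanism="Context Files"):
    """Per-tool share of repositories that use only `mechanism`."""
    by_tool = {}
    for rec in records:
        for tool in rec["tools"]:
            # A repository configured for several tools counts once per tool.
            by_tool.setdefault(tool, []).append(rec)
    return {tool: sole_share(recs, mechanism) for tool, recs in by_tool.items()}
```

A record here would look like `{"tools": {"Claude Code"}, "mechanisms": {"Context Files", "Skills"}}`. Counting a multi-tool repository once per tool is a design choice of this sketch, so per-tool denominators can sum to more than the number of repositories.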

Circularity Check

0 steps flagged

No circularity: purely observational empirical study with direct counts from repository data

The paper performs an exploratory analysis by identifying eight configuration mechanisms and reporting their adoption frequencies across 2,853 GitHub repositories. All claims (e.g., dominance of Context Files, emergence of AGENTS.md) are direct empirical observations and qualitative summaries of the sampled artifacts. No equations, derivations, fitted parameters, or predictions exist that could reduce to inputs by construction. No self-citations serve as load-bearing uniqueness theorems or ansatzes. The analysis is self-contained against external benchmarks of repository inspection and does not invoke any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is observational and introduces no new mathematical parameters or postulated entities. It rests on the standard empirical software engineering assumption that a sample of public GitHub repositories reflects typical configuration behavior.

axioms (1)

Domain assumption: The sampled GitHub repositories reflect typical usage patterns of agentic AI coding tools.
The study generalizes its findings on configuration dominance and adoption rates from the 2,853 repositories to broader developer practice.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Dataset of Agentic AI Coding Tool Configurations
cs.SE 2026-05 accept novelty 8.0

A publicly released dataset of 15,591 configuration artifacts for five agentic AI coding tools, drawn from 4,738 GitHub repositories along with associated files and AI-co-authored commits.

Inside the Scaffold: A Source-Code Taxonomy of Coding Agent Architectures
cs.SE 2026-04 accept novelty 7.0

Analysis of 13 coding agent scaffolds at pinned commits yields a 12-dimension taxonomy showing five composable loop primitives, with 11 agents combining multiple primitives instead of using one fixed structure.

Beyond Human-Readable: Rethinking Software Engineering Conventions for the Agentic Development Era
cs.SE 2026-04 unverdicted novelty 6.0

Optimizing code for semantic density rather than human readability can improve agentic AI development efficiency, but aggressive compression of logs increased overall costs by shifting burden to reasoning.
