From Assistance to Agency: Rethinking Autonomy and Control in CI/CD Pipelines
Pith reviewed 2026-05-11 02:02 UTC · model grok-4.3
The pith
The central challenge in agentic CI/CD is designing authority transfer from humans to agents rather than improving task performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents a vision of agentic CI/CD in which the central challenge is not improving task performance but designing authority transfer, defined as the delegation of operational decisions from human-controlled pipelines to agent systems under specified constraints and recourse mechanisms. Drawing on research prototypes and industrial platforms, it shows that current systems operate mainly at the data plane under bounded autonomy, with safety achieved through surrounding governance infrastructure rather than intrinsic agent guarantees.
What carries the argument
The distinction between data-plane authority for localized interventions such as patch generation and test reruns and control-plane authority for modifications to pipeline configuration, deployment policies, and approval gates, which structures the analysis of how much decision power is delegated.
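To make the two planes concrete, here is a minimal Python sketch of an action classifier. The action identifiers are hypothetical labels for the examples the paper gives (patch generation, test reruns, pipeline configuration, deployment policies, approval gates); the paper does not define such a taxonomy in code, so treat this as an illustration rather than the authors' method.

```python
from enum import Enum


class Plane(Enum):
    DATA = "data-plane"
    CONTROL = "control-plane"


# Hypothetical identifiers for the examples named in the paper's definition.
DATA_PLANE_ACTIONS = frozenset({"generate_patch", "rerun_tests"})
CONTROL_PLANE_ACTIONS = frozenset(
    {"modify_pipeline_config", "change_deployment_policy", "alter_approval_gate"}
)


def classify(action: str) -> Plane:
    """Map an action identifier to the plane whose authority it exercises."""
    if action in DATA_PLANE_ACTIONS:
        return Plane.DATA
    if action in CONTROL_PLANE_ACTIONS:
        return Plane.CONTROL
    # Unknown actions are rejected rather than guessed at.
    raise ValueError(f"unclassified action: {action!r}")
```

Raising on unknown actions is a deliberate choice in this sketch: how an unlisted action gets classified is exactly the boundary question the referee report raises.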
If this is right
- Current systems achieve safety through external governance rather than built-in agent guarantees.
- Three recurring patterns appear across platforms: constrained autonomy as the dominant design, external governance as the primary safety mechanism, and a widening gap between deployment momentum and evaluation methodology.
- Control-plane safety and governance mechanisms represent the most urgent open problem.
- Subsequent priorities include formalization of autonomy boundaries, new evaluation frameworks, and protocols for human-agent coordination.
Where Pith is reading between the lines
- If the data-plane and control-plane split holds, it could be used to design staged systems where agents first handle data-plane actions before any control-plane proposals reach human review.
- The same authority-transfer lens might apply to other software automation domains such as infrastructure provisioning or monitoring alert handling.
- A broader survey across additional industrial tools could strengthen or refine the observation that most agents stay at the data plane.
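The staged design suggested in the first point above can be sketched as a small routing function: data-plane actions proceed automatically, while control-plane proposals are queued for human review. The target names and the `route` function are assumptions made for illustration; they do not correspond to any platform's API.

```python
from dataclasses import dataclass

# Hypothetical target labels, derived from the paper's examples of each plane.
DATA_PLANE_TARGETS = frozenset({"source_patch", "test_rerun"})
CONTROL_PLANE_TARGETS = frozenset(
    {"pipeline_config", "deployment_policy", "approval_gate"}
)


@dataclass(frozen=True)
class ProposedAction:
    description: str
    target: str  # what the agent wants to change


def route(action: ProposedAction) -> str:
    """Return 'auto' for data-plane actions and 'human_review' otherwise."""
    if action.target in DATA_PLANE_TARGETS:
        return "auto"
    # Control-plane targets and anything unrecognized fail closed:
    # they are sent to a human instead of being executed.
    return "human_review"
```

Sending unrecognized targets to review keeps the sketch conservative, in line with the bounded-autonomy pattern the paper describes.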
Load-bearing premise
That the distinction between data-plane and control-plane authority is a useful and natural way to analyze autonomy in CI/CD, and that observations from research prototypes and industrial platforms suffice to establish that current systems are limited to data-plane operations with external safety mechanisms.
What would settle it
Discovery of a production CI/CD system in which an agent can independently modify approval gates or deployment policies without any external governance layer or human recourse would directly test the claim that systems remain confined to data-plane authority.
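The settling condition can be written as a predicate over an observed system. The field names below are hypothetical and the sketch assumes each property can be established as a simple yes or no, which in practice would require its own audit procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedSystem:
    in_production: bool
    agent_modifies_gates_or_policies: bool  # independently, without approval
    has_external_governance: bool
    has_human_recourse: bool


def is_counterexample(system: ObservedSystem) -> bool:
    """True if the system matches the settling condition stated above."""
    return (
        system.in_production
        and system.agent_modifies_gates_or_policies
        and not system.has_external_governance
        and not system.has_human_recourse
    )
```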
Original abstract
AI agents are assuming active roles in Continuous Integration and Continuous Deployment (CI/CD) workflows, yet the research community lacks a shared vocabulary for describing what it means for CI/CD to be agentic, how much decision authority is delegated, and where control should reside. This paper presents a vision of agentic CI/CD in which the central challenge is not improving task performance but designing authority transfer, defined as the delegation of operational decisions from human-controlled pipelines to agent systems under specified constraints and recourse mechanisms. To structure this argument, we introduce a distinction between data-plane authority (localized interventions such as patch generation and test reruns) and control-plane authority (modifications to pipeline configuration, deployment policies, and approval gates). Drawing on research prototypes and industrial platforms, we show that current systems operate mainly at the data plane under bounded autonomy, with safety achieved through surrounding governance infrastructure rather than intrinsic agent guarantees. We identify three recurring patterns: constrained autonomy as the dominant design, external governance as the primary safety mechanism, and a widening gap between deployment momentum and evaluation methodology. We propose a research agenda in which control-plane safety and governance mechanisms represent the most urgent open problem, followed by formalization of autonomy boundaries, evaluation frameworks, and human-agent coordination.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the central challenge in agentic CI/CD is not task performance but designing authority transfer—the delegation of operational decisions from human-controlled pipelines to agent systems under specified constraints and recourse mechanisms. It introduces a distinction between data-plane authority (localized interventions such as patch generation and test reruns) and control-plane authority (modifications to pipeline configuration, deployment policies, and approval gates). Drawing on research prototypes and industrial platforms, it asserts that current systems operate mainly at the data plane under bounded autonomy with safety achieved through external governance rather than intrinsic guarantees, identifies three recurring patterns, and proposes a research agenda prioritizing control-plane safety, formalization of autonomy boundaries, evaluation frameworks, and human-agent coordination.
Significance. If the proposed framework holds, it could provide a useful conceptual lens for analyzing autonomy in AI-augmented DevOps workflows and help structure discussions around governance gaps. The emphasis on authority transfer rather than performance metrics may stimulate targeted research on safety mechanisms and human-AI coordination in software engineering. Its influence will depend on whether the data/control-plane distinction proves operationalizable and is validated through subsequent empirical work.
major comments (2)
- [Abstract] The assertion that 'current systems operate mainly at the data plane' (Abstract) is load-bearing for the central argument yet rests on unspecified observations from 'research prototypes and industrial platforms' without providing explicit criteria, a decision procedure, or an enumerated list of examined systems for classifying capabilities as data-plane versus control-plane. This leaves the generalization that existing tools are confined to localized interventions unverifiable and sensitive to how the planes are drawn.
- [Discussion of current systems and patterns] The identification of the three recurring patterns (constrained autonomy as dominant, external governance as primary safety mechanism, widening gap between deployment and evaluation) is presented without detailed mappings, case studies, or references to specific prototypes, which undermines the foundation for the subsequent research agenda that treats these patterns as established.
minor comments (1)
- [Terminology introduction] The terms 'data-plane authority' and 'control-plane authority' are introduced without referencing analogous distinctions from networking or distributed systems literature, which could help readers understand the intended analogy and scope.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which identify opportunities to improve the verifiability of our claims in this vision paper. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript without changing its conceptual focus.
- Referee: [Abstract] The assertion that 'current systems operate mainly at the data plane' (Abstract) is load-bearing for the central argument yet rests on unspecified observations from 'research prototypes and industrial platforms' without providing explicit criteria, a decision procedure, or an enumerated list of examined systems for classifying capabilities as data-plane versus control-plane. This leaves the generalization that existing tools are confined to localized interventions unverifiable and sensitive to how the planes are drawn.
  Authors: We agree that the abstract claim would benefit from greater transparency. The distinction between data-plane and control-plane authority is defined in Section 2 of the manuscript, and the generalization draws from the specific prototypes and platforms analyzed in Sections 3 and 4. To make the classification process explicit and verifiable, we will add a new subsection (or appendix) that enumerates the examined systems, states the decision criteria (whether a system can alter pipeline configuration, policies, or gates versus performing only localized actions), and provides a brief mapping for each. This addition will not expand the paper's scope but will allow readers to assess the generalization directly.
  Revision: yes
- Referee: [Discussion of current systems and patterns] The identification of the three recurring patterns (constrained autonomy as dominant, external governance as primary safety mechanism, widening gap between deployment and evaluation) is presented without detailed mappings, case studies, or references to specific prototypes, which undermines the foundation for the subsequent research agenda that treats these patterns as established.
  Authors: The three patterns are synthesized from the concrete examples already referenced in the manuscript (e.g., the research prototypes and industrial platforms discussed in Sections 3–4). We acknowledge that the presentation would be stronger with explicit linkages. In revision we will insert a concise table or bulleted mapping that associates each pattern with one or more specific systems or citations, thereby grounding the patterns without converting the paper into an empirical survey. This will directly support the research agenda that follows.
  Revision: yes
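The decision criterion stated in the first response (whether a system can alter pipeline configuration, policies, or gates versus performing only localized actions) could be applied per system roughly as follows. This is a sketch under assumptions: the `SystemProfile` fields and the placeholder system name are hypothetical, and no real platform is classified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    name: str
    can_alter_pipeline_config: bool
    can_alter_deployment_policies: bool
    can_alter_approval_gates: bool


def authority_level(profile: SystemProfile) -> str:
    """Classify a system by the highest plane of authority it can exercise."""
    if (
        profile.can_alter_pipeline_config
        or profile.can_alter_deployment_policies
        or profile.can_alter_approval_gates
    ):
        return "control-plane"
    # Only localized actions (e.g. patches, test reruns) remain.
    return "data-plane"


# Placeholder entry, not a real product.
example = SystemProfile("hypothetical-agent", False, False, False)
print(example.name, "->", authority_level(example))
```

A table of such profiles, one row per examined system, would give readers the enumerated mapping the referee asks for.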
Circularity Check
No significant circularity in conceptual analysis of CI/CD autonomy
The paper introduces definitions for authority transfer and the data-plane versus control-plane distinction explicitly to frame its vision, then presents the observation that current systems are limited to data-plane operations as drawn from external prototypes and platforms. No equations, fitted parameters, self-citations, or derivations are used that reduce any central claim to its own inputs by construction. The argument relies on new conceptual distinctions and high-level observations rather than self-referential logic, rendering the derivation self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Current AI-augmented CI/CD systems operate mainly at the data plane under bounded autonomy with safety provided by external governance.
invented entities (2)
- data-plane authority: no independent evidence
- control-plane authority: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear · We introduce a distinction between data-plane authority (localized interventions such as patch generation and test reruns) and control-plane authority (modifications to pipeline configuration, deployment policies, and approval gates).
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear · Safety is achieved primarily through surrounding governance infrastructure rather than intrinsic agent guarantees.