pith. machine review for the scientific record.

arxiv: 2604.15468 · v2 · submitted 2026-04-16 · 💻 cs.SE · cs.AI


The Semi-Executable Stack: Agentic Software Engineering and the Expanding Scope of SE

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 10:29 UTC · model grok-4.3

classification 💻 cs.SE cs.AI
keywords semi-executable artifacts · agentic software engineering · AI-augmented software engineering · semi-executable stack · diagnostic model · organizational routines · software engineering scope

The pith

Software engineering is expanding from executable code to semi-executable artifacts that combine language, tools, workflows, and routines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper contends that AI agents and foundation models do not make software engineering obsolete but instead broaden the target of engineering work. What must now be built includes not only runnable programs but also instructional text, orchestration logic, control layers, and organizational patterns that execute only through human judgment or probabilistic outcomes. The Semi-Executable Stack supplies a six-ring reference model to locate any given contribution or friction point within this wider scope. The authors illustrate the model with three cases and supply a preserve-versus-purify rule for deciding which existing practices to retain. Readers who accept the framing gain a concrete way to map new agentic systems onto familiar engineering concerns rather than treating them as replacements.

Core claim

The important shift is not that software engineering loses relevance. It is that the thing being engineered expands beyond executable code to semi-executable artifacts: combinations of natural language, tools, workflows, control mechanisms, and organizational routines whose enactment depends on human or probabilistic interpretation rather than deterministic execution. The Semi-Executable Stack is introduced as a six-ring diagnostic reference model spanning executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit. The model helps locate where a contribution, bottleneck, or organizational transition primarily sits, and which adjacent rings it depends on.

What carries the argument

The Semi-Executable Stack, a six-ring diagnostic reference model whose rings are executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit.
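The paper names the six rings but prescribes no formal encoding. As a minimal sketch, the stack could be held as an ordered enumeration, with a helper that records a contribution's primary ring and any adjacent-ring dependencies; the `Ring` ordering and the `locate` helper are illustrative assumptions, not the authors' notation:

```python
from enum import Enum

class Ring(Enum):
    """The six rings of the Semi-Executable Stack, inner to outer
    (ordering assumed from the paper's listing)."""
    EXECUTABLE_ARTIFACTS = 1
    INSTRUCTIONAL_ARTIFACTS = 2
    ORCHESTRATED_EXECUTION = 3
    CONTROLS = 4
    OPERATING_LOGIC = 5
    SOCIETAL_INSTITUTIONAL_FIT = 6

def locate(primary: Ring, depends_on: list[Ring]) -> dict:
    """Hypothetical diagnostic record: one primary ring, plus those
    dependencies that sit on a ring adjacent to it."""
    adjacent = {r for r in depends_on if abs(r.value - primary.value) == 1}
    return {"primary": primary,
            "adjacent_dependencies": sorted(adjacent, key=lambda r: r.value)}

# Example: a prompt-library contribution sits in the instructional ring
# and leans on executable artifacts and orchestration.
loc = locate(Ring.INSTRUCTIONAL_ARTIFACTS,
             [Ring.EXECUTABLE_ARTIFACTS, Ring.ORCHESTRATED_EXECUTION])
```

The adjacency rule mirrors the model's claim that a contribution is assigned one primary ring while noting dependencies on neighboring rings.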

If this is right

  • Any contribution or bottleneck can be assigned to one primary ring while noting its dependencies on adjacent rings.
  • Familiar objections to agentic AI in software engineering become concrete engineering targets rather than reasons to reject the transition.
  • Legacy processes, controls, and coordination routines can be evaluated with the preserve-versus-purify heuristic to decide what to keep versus simplify.
  • Organizational transitions can be diagnosed by tracking which rings are changing and which remain stable.
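The preserve-versus-purify heuristic is described only informally in the paper. A hedged sketch, assuming two illustrative criteria (whether a legacy routine still addresses a live engineering risk, and whether agent-side controls already cover it), could read:

```python
def preserve_or_purify(practice: dict) -> str:
    """Classify a legacy SE practice under the preserve-versus-purify
    heuristic. The two keys below are illustrative stand-ins; the paper
    states no algorithm:
      - 'addresses_live_risk': the routine still mitigates a real risk
      - 'redundant_with_agent_controls': agent-layer controls cover it
    """
    if practice["addresses_live_risk"] and not practice["redundant_with_agent_controls"]:
        return "preserve"
    return "purify"  # simplify or redesign

# Example: a manual release checklist fully duplicated by automated gates.
verdict = preserve_or_purify({"addresses_live_risk": True,
                              "redundant_with_agent_controls": True})
```

Any operational version would need the paper's own criteria substituted for these placeholders.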

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The model could be tested by mapping existing agent frameworks onto the six rings to reveal gaps not yet addressed by current tools.
  • It offers a way to compare how different organizations adapt their routines when introducing agentic systems.
  • The preserve-versus-purify heuristic could generate measurable criteria for retaining or retiring specific SE practices.
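The mapping exercise in the first bullet can be sketched as a simple coverage check; the framework features and their ring assignments below are hypothetical, not drawn from the paper:

```python
# The six rings, inner to outer, as listed in the paper.
RINGS = ["executable artifacts", "instructional artifacts",
         "orchestrated execution", "controls", "operating logic",
         "societal and institutional fit"]

# Hypothetical feature-to-ring mapping for some agent framework.
framework_coverage = {
    "code generation": "executable artifacts",
    "prompt templates": "instructional artifacts",
    "multi-step planner": "orchestrated execution",
}

# Rings with no mapped feature are the candidate gaps.
covered = set(framework_coverage.values())
gaps = [r for r in RINGS if r not in covered]
# gaps → ["controls", "operating logic", "societal and institutional fit"]
```

Run over real frameworks, such a table would make the bullet's "gaps not yet addressed by current tools" concrete and comparable.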

Load-bearing premise

The six-ring Semi-Executable Stack supplies a sufficiently complete and actionable diagnostic lens for locating contributions, bottlenecks, and transitions across the expanded scope of software engineering.

What would settle it

A detailed case study of an AI-augmented software project in which the six rings cannot be used to identify the primary location of a major bottleneck or contribution would falsify the model's claimed utility.

Figures

Figures reproduced from arXiv: 2604.15468 by Dhasarathy Parthasarathy, Julian Frattini, Per Lenberg, Robert Feldt.

Figure 1: The Semi-Executable Stack. The rings represent a spectrum of engineering objects …
read the original abstract

AI-based systems, currently driven largely by LLMs and tool-using agentic harnesses, are increasingly discussed as a possible threat to software engineering. Foundation models get stronger, agents can plan and act across multiple steps, and tasks such as scaffolding, routine test generation, straightforward bug fixing, and small integration work look more exposed than they did only a few years ago. The result is visible unease not only among students and junior developers, but also among experienced practitioners who worry that hard-won expertise may lose value. This paper argues for a different reading. The important shift is not that software engineering loses relevance. It is that the thing being engineered expands beyond executable code to semi-executable artifacts; combinations of natural language, tools, workflows, control mechanisms, and organizational routines whose enactment depends on human or probabilistic interpretation rather than deterministic execution. The Semi-Executable Stack is introduced as a six-ring diagnostic reference model for reasoning about that expansion, spanning executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit. The model helps locate where a contribution, bottleneck, or organizational transition primarily sits, and which adjacent rings it depends on. The paper develops the argument through three worked cases, reframes familiar objections as engineering targets rather than reasons to dismiss the transition, and closes with a preserve-versus-purify heuristic for deciding which legacy software engineering processes, controls, and coordination routines should be kept and which should be simplified or redesigned. This paper is a conceptual keynote companion: diagnostic and agenda-setting rather than empirical.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims that AI-driven agentic systems do not threaten software engineering but expand its scope beyond executable code to semi-executable artifacts (natural language, tools, workflows, controls, and organizational routines reliant on human or probabilistic interpretation). It introduces the Semi-Executable Stack as a six-ring diagnostic reference model (executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal/institutional fit) to locate contributions and bottlenecks, develops the reframing via three worked cases, treats familiar objections as engineering targets, and proposes a preserve-versus-purify heuristic for legacy processes.

Significance. If the reframing holds, the paper supplies a useful conceptual tool for the SE community to map the shifting boundaries of the field amid agentic AI. Its diagnostic model and heuristic could help researchers and practitioners identify where new work fits in the expanded stack and decide what legacy elements to retain or redesign. The explicitly agenda-setting, non-empirical stance is a strength, as it avoids overclaiming while offering a structured lens for future contributions.

minor comments (2)
  1. The abstract lists the six rings but does not provide even one-sentence characterizations of each; adding brief definitions would improve immediate accessibility without altering the conceptual focus.
  2. The three worked cases are referenced as illustrations of the model, but the manuscript would benefit from an explicit mapping table or paragraph showing which rings each case primarily engages; this is a clarity issue rather than a challenge to the central claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the manuscript and for the recommendation of minor revision. The summary accurately reflects the paper's positioning as a conceptual, agenda-setting contribution that introduces the Semi-Executable Stack as a diagnostic model for the expanding scope of software engineering amid agentic AI systems. We appreciate the recognition of its potential value to researchers and practitioners in locating contributions and bottlenecks within the six-ring structure, as well as the preserve-versus-purify heuristic for legacy processes. We will make any minor editorial adjustments in the revised version.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is explicitly conceptual and agenda-setting. It introduces the six-ring Semi-Executable Stack as an independent diagnostic reference model, develops the reframing through three worked cases, and offers a preserve-versus-purify heuristic. No equations, fitted parameters, predictions, or derivations are present that could reduce to inputs by construction. The central claim concerns an expansion of scope rather than a mechanism derived from its own assumptions or self-citations, rendering the argument self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper is conceptual and introduces a new diagnostic model; no empirical data or formal proof appears in the abstract. It rests on the domain assumption that AI capabilities are expanding across SE tasks and postulates the six-ring stack as the organizing lens.

axioms (1)
  • domain assumption AI-based systems driven by LLMs and agentic harnesses are increasingly capable of tasks such as scaffolding, routine test generation, and bug fixing that were previously the domain of software engineers.
    Stated as the premise creating visible unease among practitioners and students.
invented entities (1)
  • Semi-Executable Stack no independent evidence
    purpose: Six-ring diagnostic reference model spanning executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal/institutional fit.
    Newly proposed framework for locating contributions and bottlenecks in the expanded SE scope.

pith-pipeline@v0.9.0 · 5590 in / 1406 out tokens · 49136 ms · 2026-05-10T10:29:45.188779+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

49 extracted references · 27 canonical work pages

  1. [1]

    Transitioning manual system test suites to automated testing: An industrial case study

    Emil Alégroth, Robert Feldt, and Helena Holmström Olsson. Transitioning manual system test suites to automated testing: An industrial case study. In2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, pages 56–65, 2013. doi: 10.1109/ICST.2013.14

  2. [2]

    Anthropic economic index, january 2026 report, 2026

    Anthropic. Anthropic economic index, january 2026 report, 2026. URLhttps://www.an thropic.com/research/anthropic-economic-index-january-2026-report . Accessed April 22, 2026

  3. [3]

    Baxter and Ian Sommerville

    Gordon D. Baxter and Ian Sommerville. Socio-technical systems: From design methods to systems engineering.Interacting with Computers, 23(1):4–17, 2011

  4. [4]

    Measuring the impact of early-2025 AI on experienced open-source developer productivity, 2025

    Joel Becker, Nate Rush, Elizabeth Barnes, and David Rein. Measuring the impact of early-2025 AI on experienced open-source developer productivity, 2025. URL https: //metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/ . Model Evaluation and Threat Research (METR). Published July 10, 2025; Accessed April 22, 2026

  5. [5]

    Brooks, Jr

    Frederick P. Brooks, Jr. No silver bullet: Essence and accidents of software engineering. Computer, 20(4):10–19, 1987. doi: 10.1109/MC.1987.1663532

  6. [6]

    Canaries in the coal mine? six facts about the recent employment effects of artificial intelligence, 2025

    Erik Brynjolfsson, Bharat Chandar, and Ruyu Chen. Canaries in the coal mine? six facts about the recent employment effects of artificial intelligence, 2025. URLhttps: //digitaleconomy.stanford.edu/publication/canaries-in-the-coal-mine-six -facts-about-the-recent-employment-effects-of-artificial-intelligence/ . Published November 13, 2025; Accessed April 22, 2026

  7. [7]

    AI in software engineering at google: Progress and the path ahead, 2024

    Satish Chandra and Maxim Tabachnyk. AI in software engineering at google: Progress and the path ahead, 2024. URLhttps://research.google/blog/ai-in-software-enginee ring-at-google-progress-and-the-path-ahead/ . Published June 6, 2024; Accessed April 22, 2026

  8. [8]

    Addison-Wesley Professional, 3rd edition, 2011

    Mary Beth Chrissis, Mike Konrad, and Sandy Shrum.CMMI for Development: Guidelines for Process Integration and Product Improvement. Addison-Wesley Professional, 3rd edition, 2011. 16

  9. [9]

    2025 state of AI-assisted software development report, 2025

    Derek DeBellis, Kevin Storer, Nathen Harvey, Matt Beane, Rob Edwards, Edward Fraser, Ben Good, Eirini Kalliamvakou, Gene Kim, Eric Maxwell, Sarah D’Angelo, Sarah Inman, Ambar Murillo, and Daniella Villalba. 2025 state of AI-assisted software development report, 2025. URL https://research.google/pubs/dora-2025-state-of-ai-assiste d-software-development-rep...

  10. [10]

    The cybernetic teammate: A field experiment on generative AI reshaping teamwork and expertise, 2025

    Fabrizio Dell’Acqua, Charles Ayoubi, Hila Lifshitz, Raffaella Sadun, Ethan Mollick, Lilach Mollick, Yi Han, Jeff Goldman, Hari Nair, Stewart Taub, and Karim Lakhani. The cybernetic teammate: A field experiment on generative AI reshaping teamwork and expertise, 2025. URLhttps://www.nber.org/papers/w33641. NBER Working Paper 33641

  11. [11]

    Challenges in testing large language model based software: A faceted taxonomy.ACM Transactions on Software Engineering and Methodology, 35(4):1–38, 2026

    Felix Dobslaw, Robert Feldt, Juyeon Yoon, and Shin Yoo. Challenges in testing large language model based software: A faceted taxonomy.ACM Transactions on Software Engineering and Methodology, 35(4):1–38, 2026. doi: 10.1145/3806396

  12. [12]

    Regulation (eu) 2024/1689 laying down harmonised rules on artificial intelligence (AI act), 2024

    European Union. Regulation (eu) 2024/1689 laying down harmonised rules on artificial intelligence (AI act), 2024. URLhttps://eur-lex.europa.eu/legal-content/en/LSU/ ?qid=1744648637579&uri=CELEX%3A32024R1689 . Official summary and legal reference; Accessed April 22, 2026

  13. [13]

    Cognition in software engineering: A taxonomy and survey of a half-century of research.ACM Computing Surveys, 54(11s): 1–36, 2022

    Fabian Fagerholm, Michael Felderer, Davide Fucci, Michael Unterkalmsteiner, Bogdan Marculescu, Markus Martini, Lucas Gren, Lars Göran Wallgren Tengberg, Robert Feldt, Antti Lehtelä, Bettina Nagyváradi, and Jehan Khattak. Cognition in software engineering: A taxonomy and survey of a half-century of research.ACM Computing Surveys, 54(11s): 1–36, 2022. doi: ...

  14. [14]

    Angela Fan, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo, and Jie M. Zhang. Large language models for software engineering: Survey and open problems. InProceedings of the 45th IEEE/ACM International Conference on Software Engineering: Future of Software Engineering, pages 31–53, 2023. doi: 10.1109/ICSE-FoSE5 9343.2023.00008

  15. [15]

    Ways of applying artificial intelligence in software engineering

    Robert Feldt, Francisco Gomes de Oliveira Neto, and Richard Torkar. Ways of applying artificial intelligence in software engineering. InProceedings of the 6th International Work- shop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE@ICSE), pages 35–41, 2019

  16. [16]

    Towards autonomous testing agents via conversational large language models

    Robert Feldt, Sungmin Kang, Juyeon Yoon, and Shin Yoo. Towards autonomous testing agents via conversational large language models. InProceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1688–1693,

  17. [17]

    doi: 10.1109/ASE56229.2023.00148

  18. [18]

    A software engineering perspective on engineering machine learning systems: State of the art and challenges.Journal of Systems and Software, 180:111031, 2021

    Görkem Giray. A software engineering perspective on engineering machine learning systems: State of the art and challenges.Journal of Systems and Software, 180:111031, 2021. doi: 10.1016/j.jss.2021.111031

  19. [19]

    Wobbrock, and Katharina Reinecke

    Daniel Graziotin, Per Lenberg, Robert Feldt, and Stefan Wagner. Psychometrics in behavioral software engineering: A methodological introduction with guidelines.ACM Transactions on Software Engineering and Methodology, 31(1):1–36, 2022. doi: 10.1145/34 69888

  20. [20]

    Cross-functional AI task forces (X-FAITs) for AI transfor- mation of software organizations

    Lucas Gren and Robert Feldt. Cross-functional AI task forces (X-FAITs) for AI transfor- mation of software organizations. InProceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 793–796, 2025. doi: 10.1145/3756681.3757015. 17

  21. [21]

    Large Language Models for Software Engineering: A Sys- tematic Literature Review,

    Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Daniel Luo, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. Large language models for software engineering: A systematic literature review.ACM Transactions on Software Engineering and Methodology, 33(8):1–79, 2024. doi: 10.1145/3695988

  22. [22]

    Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik R

    Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik R. Narasimhan. SWE-bench: Can language models resolve real-world github issues? InThe Twelfth International Conference on Learning Representations (ICLR), 2024. URL https://openreview.net/forum?id=VTF8yNQM66. ICLR 2024; Accessed April 22, 2026

  23. [23]

    Gatelens: A reasoning-enhanced LLM agent for automotive software release analytics, 2025

    Arsham Gholamzadeh Khoee, Shuai Wang, Robert Feldt, Dhasarathy Parthasarathy, and Yinan Yu. Gatelens: A reasoning-enhanced LLM agent for automotive software release analytics, 2025. URL https://arxiv.org/abs/2503.21735. arXiv:2503.21735; Accessed April 22, 2026

  24. [24]

    Gonogo: AnefficientLLM-basedmulti-agentsystem for streamlining automotive software release decision-making

    Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt, Andris Freimanis, Patrick Andersson Rhodin, andDhasarathyParthasarathy. Gonogo: AnefficientLLM-basedmulti-agentsystem for streamlining automotive software release decision-making. InProceedings of the 36th IFIP WG 6.1 International Conference on Testing Software and Systems (ICTSS), volume 15383 of Lecture...

  25. [25]

    Amy J. Ko, Robin Abraham, Laura Beckwith, Alan Blackwell, Margaret Burnett, Martin Erwig, Chris Scaffidi, Joseph Lawrance, Henry Lieberman, Brad Myers, Mary Beth Rosson, Gregg Rothermel, Mary Shaw, and Susan Wiedenbeck. The state of the art in end-user software engineering.ACM Computing Surveys, 43(3):1–44, 2011. doi: 10.1145/1922649.19 22658

  26. [26]

    Machine learning operations (MLOps): Overview, definition, and architecture, 2023

    Dominik Kreuzberger, Niklas Kühl, and Sebastian Hirschl. Machine learning operations (MLOps): Overview, definition, and architecture, 2023. URLhttps://arxiv.org/abs/22 05.02302. arXiv:2205.02302; Accessed April 22, 2026

  27. [27]

    arXiv preprint arXiv:2503.14499 , year=

    Thomas Kwa, Ben West, Joel Becker, Amy Deng, Katharyn Garcia, Max Hasin, Sami Jawhar, Megan Kinniment, Nate Rush, Sydney von Arx, Ryan Bloom, Thomas Broadley, Haoxing Du, Brian Goodrich, Nikola Jurkovic, Luke Harold Miles, Seraphina Nix, Tao Lin, Neev Parikh, David Rein, Lucas Jun Koba Sato, Hjalmar Wijk, Daniel M. Ziegler, Elizabeth Barnes, and Lawrence ...

  28. [28]

    Behavioral software engineering: A definition and systematic literature review.Journal of Systems and Software, 107:15–37,

    Per Lenberg, Robert Feldt, and Lars Göran Wallgren. Behavioral software engineering: A definition and systematic literature review.Journal of Systems and Software, 107:15–37,

  29. [29]

    doi: 10.1016/j.jss.2015.04.084

  30. [30]

    Human factors related challenges in software engineering: An industrial perspective

    Per Lenberg, Robert Feldt, and Lars Göran Wallgren. Human factors related challenges in software engineering: An industrial perspective. In2015 IEEE/ACM 8th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 43–49, 2015. doi: 10.1109/CHASE.2015.13

  31. [31]

    Manning, Christo- pher Ré, et al

    Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Ya- sunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christo- pher Ré, et al. Holistic evaluation of language models.Transactions on Machine Learning Research, 2023....

  32. [32]

    Springer, 2006

    Henry Lieberman, Fabio Paternò, and Volker Wulf, editors.End User Development, volume 9 ofHuman-Computer Interaction Series. Springer, 2006

  33. [33]

    Large language model-based agents for software engineering: A survey.arXiv preprint arXiv:2409.02977, 2024

    Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, and Yiling Lou. Large language model-based agents for software engineering: A survey, 2024. URLhttps://arxiv.org/abs/2409.02977. arXiv:2409.02977; Accessed April 22, 2026

  34. [34]

    Software engineering for AI-based systems: A survey.ACM Transactions on Software Engineering and Methodology, 31(2), 2022

    Silverio Martínez-Fernández, Justus Bogner, Xavier Franch, Marc Oriol, Julien Siebert, Adam Trendowicz, Anna Maria Vollmer, and Stefan Wagner. Software engineering for AI-based systems: A survey.ACM Transactions on Software Engineering and Methodology, 31(2), 2022. doi: 10.1145/3487043

  35. [35]

    2025: The year the frontier firm is born, 2025

    Microsoft WorkLab. 2025: The year the frontier firm is born, 2025. URLhttps://www.mi crosoft.com/en-us/worklab/work-trend-index/2025-the-year-the-frontier-fir m-is-born. Published April 29, 2025; Accessed April 22, 2026

  36. [36]

    ACM TOSEM32, 3, Article 75 (April 2023), 30 pages

    Michel Nass, Emil Alégroth, Robert Feldt, Maurizio Leotta, and Filippo Ricca. Similarity- based web element localization for robust test automation.ACM Transactions on Software Engineering and Methodology, 32(3):1–30, 2023. doi: 10.1145/3571855

  37. [37]

    AI risk management framework, 2023

    National Institute of Standards and Technology. AI risk management framework, 2023. URL https://www.nist.gov/itl/ai-risk-management-framework. Accessed April 22, 2026

  38. [38]

    Artificial intelligence risk management framework: Generative artificial intelligence profile, 2024

    National Institute of Standards and Technology. Artificial intelligence risk management framework: Generative artificial intelligence profile, 2024. URLhttps://www.nist.gov/p ublications/artificial-intelligence-risk-management-framework-generative-a rtificial-intelligence. Published July 26, 2024; Accessed April 22, 2026

  39. [39]

    An empirical study on decision-making aspects in responsible software engineering for AI

    Lekshmi Murali Rani, Faezeh Mohammadi, Robert Feldt, and Richard Berntsson Svensson. An empirical study on decision-making aspects in responsible software engineering for AI. In2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pages 575–585. IEEE, 2025. doi: 10.1109/ICSE-SEI P66354.2025.00056

  40. [40]

    Bridging the socio- emotional gap: The functional dimension of human-AI collaboration for software engineering,

    Lekshmi Murali Rani, Richard Berntsson Svensson, and Robert Feldt. Bridging the socio- emotional gap: The functional dimension of human-AI collaboration for software engineering,

  41. [41]

    arXiv:2601.19387; Accessed April 22, 2026

    URL https://arxiv.org/abs/2601.19387. arXiv:2601.19387; Accessed April 22, 2026

  42. [42]

    Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, and Dan Dennison

    D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, and Dan Dennison. Hidden technical debt in machine learning systems. InAdvances in Neural Information Processing Systems (NeurIPS), pages 2503–2511, 2015

  43. [43]

    The unfulfilled potential of data-driven decision making in agile software development

    Richard Berntsson Svensson, Robert Feldt, and Richard Torkar. The unfulfilled potential of data-driven decision making in agile software development. InAgile Processes in Software Engineering and Extreme Programming, volume355ofLecture Notes in Business Information Processing, pages 69–85. Springer, 2019. doi: 10.1007/978-3-030-19034-7_5

  44. [44]

    Vaccargiu, S

    Mahan Tafreshipour, Aaron Imani, Eric Huang, Eduardo Santana de Almeida, Thomas Zimmermann, and Iftekhar Ahmed. Prompting in the wild: An empirical study of prompt evolution in software repositories. InProceedings of the 22nd IEEE/ACM International Conference on Mining Software Repositories (MSR), 2025. doi: 10.1109/MSR66628.2025.00

  45. [45]

    URL https://2025.msrconf.org/details/msr-2025-technical-papers/10/Pro 19 mpting-in-the-Wild-An-Empirical-Study-of-Prompt-Evolution-in-Software-Rep ositorie

  46. [46]

    Unpacking organizational change in AI transfor- mations of software engineering

    Theocharis Tavantzis and Robert Feldt. Unpacking organizational change in AI transfor- mations of software engineering. InProceedings of the 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE@ICSE), pages 149–160,

  47. [47]

    doi: 10.1109/CHASE66643.2025.00026

  48. [48]

    Devanbu, and Michael Pradel

    Shuai Wang, Yinan Yu, Robert Feldt, and Dhasarathy Parthasarathy. Automating a complete software test process using LLMs: An automotive case study. InProceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE), pages 373–384, 2025. doi: 10.1109/ICSE55347.2025.00211

  49. [49]

    The github recent bugs dataset for evaluating llm-based debugging applications,

    Juyeon Yoon, Robert Feldt, and Shin Yoo. Intent-driven mobile GUI testing with au- tonomous large language model agents. In2024 IEEE Conference on Software Testing, Ver- ification and Validation (ICST), pages 129–139, 2024. doi: 10.1109/ICST60714.2024.00020. A Keynote Title and Abstract Title.Agentic Software Engineering Will Eat the World: AI-Based Syste...