The Semi-Executable Stack: Agentic Software Engineering and the Expanding Scope of SE
Pith reviewed 2026-05-10 10:29 UTC · model grok-4.3
The pith
Software engineering is expanding from executable code to semi-executable artifacts that combine language, tools, workflows, and routines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The important shift is not that software engineering loses relevance. It is that the thing being engineered expands beyond executable code to semi-executable artifacts: combinations of natural language, tools, workflows, control mechanisms, and organizational routines whose enactment depends on human or probabilistic interpretation rather than deterministic execution. The Semi-Executable Stack is introduced as a six-ring diagnostic reference model spanning executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit. The model helps locate where a contribution, bottleneck, or organizational transition primarily sits, and which adjacent rings it depends on.
What carries the argument
The Semi-Executable Stack, a six-ring diagnostic reference model whose rings are executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit.
If this is right
- Any contribution or bottleneck can be assigned to one primary ring while noting its dependencies on adjacent rings.
- Familiar objections to agentic AI in software engineering become concrete engineering targets rather than reasons to reject the transition.
- Legacy processes, controls, and coordination routines can be evaluated with the preserve-versus-purify heuristic to decide what to keep versus simplify.
- Organizational transitions can be diagnosed by tracking which rings are changing and which remain stable.
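The assignment discipline implied by these points can be sketched as a small data model. The six ring names come from the paper; the `Contribution` schema and the example contribution below are hypothetical illustrations, not structures the paper defines:

```python
from dataclasses import dataclass, field

# The six rings of the Semi-Executable Stack, as named in the paper.
# Their ordering here is only for illustration.
RINGS = [
    "executable artifacts",
    "instructional artifacts",
    "orchestrated execution",
    "controls",
    "operating logic",
    "societal and institutional fit",
]

@dataclass
class Contribution:
    """A contribution or bottleneck located on the stack (hypothetical schema)."""
    name: str
    primary_ring: str
    depends_on: list = field(default_factory=list)

    def __post_init__(self):
        # Enforce the model's discipline: exactly one primary ring,
        # with dependencies noted only on other rings.
        assert self.primary_ring in RINGS
        assert all(r in RINGS and r != self.primary_ring for r in self.depends_on)

# Hypothetical example: refactoring a team's prompt library sits primarily
# in the instructional-artifacts ring but leans on two adjacent rings.
c = Contribution(
    name="prompt library refactor",
    primary_ring="instructional artifacts",
    depends_on=["orchestrated execution", "controls"],
)
print(c.primary_ring)  # instructional artifacts
```

Under this reading, an organizational transition would be a sequence of such records over time, with change concentrated in some rings while others stay stable.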
Where Pith is reading between the lines
- The model could be tested by mapping existing agent frameworks onto the six rings to reveal gaps not yet addressed by current tools.
- It offers a way to compare how different organizations adapt their routines when introducing agentic systems.
- The preserve-versus-purify heuristic could generate measurable criteria for retaining or retiring specific SE practices.
Load-bearing premise
The six-ring Semi-Executable Stack supplies a sufficiently complete and actionable diagnostic lens for locating contributions, bottlenecks, and transitions across the expanded scope of software engineering.
What would settle it
A detailed case study of an AI-augmented software project in which the six rings cannot be used to identify the primary location of a major bottleneck or contribution would falsify the model's claimed utility.
Original abstract
AI-based systems, currently driven largely by LLMs and tool-using agentic harnesses, are increasingly discussed as a possible threat to software engineering. Foundation models get stronger, agents can plan and act across multiple steps, and tasks such as scaffolding, routine test generation, straightforward bug fixing, and small integration work look more exposed than they did only a few years ago. The result is visible unease not only among students and junior developers, but also among experienced practitioners who worry that hard-won expertise may lose value. This paper argues for a different reading. The important shift is not that software engineering loses relevance. It is that the thing being engineered expands beyond executable code to semi-executable artifacts: combinations of natural language, tools, workflows, control mechanisms, and organizational routines whose enactment depends on human or probabilistic interpretation rather than deterministic execution. The Semi-Executable Stack is introduced as a six-ring diagnostic reference model for reasoning about that expansion, spanning executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal and institutional fit. The model helps locate where a contribution, bottleneck, or organizational transition primarily sits, and which adjacent rings it depends on. The paper develops the argument through three worked cases, reframes familiar objections as engineering targets rather than reasons to dismiss the transition, and closes with a preserve-versus-purify heuristic for deciding which legacy software engineering processes, controls, and coordination routines should be kept and which should be simplified or redesigned. This paper is a conceptual keynote companion: diagnostic and agenda-setting rather than empirical.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that AI-driven agentic systems do not threaten software engineering but expand its scope beyond executable code to semi-executable artifacts (natural language, tools, workflows, controls, and organizational routines reliant on human or probabilistic interpretation). It introduces the Semi-Executable Stack as a six-ring diagnostic reference model (executable artifacts, instructional artifacts, orchestrated execution, controls, operating logic, and societal/institutional fit) to locate contributions and bottlenecks, develops the reframing via three worked cases, treats familiar objections as engineering targets, and proposes a preserve-versus-purify heuristic for legacy processes.
Significance. If the reframing holds, the paper supplies a useful conceptual tool for the SE community to map the shifting boundaries of the field amid agentic AI. Its diagnostic model and heuristic could help researchers and practitioners identify where new work fits in the expanded stack and decide what legacy elements to retain or redesign. The explicitly agenda-setting, non-empirical stance is a strength, as it avoids overclaiming while offering a structured lens for future contributions.
minor comments (2)
- The abstract lists the six rings but does not provide even one-sentence characterizations of each; adding brief definitions would improve immediate accessibility without altering the conceptual focus.
- The three worked cases are referenced as illustrations of the model, but the manuscript would benefit from an explicit mapping table or paragraph showing which rings each case primarily engages; this is a clarity issue rather than a challenge to the central claim.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of the manuscript and for the recommendation of minor revision. The summary accurately reflects the paper's positioning as a conceptual, agenda-setting contribution that introduces the Semi-Executable Stack as a diagnostic model for the expanding scope of software engineering amid agentic AI systems. We appreciate the recognition of its potential value to researchers and practitioners in locating contributions and bottlenecks within the six-ring structure, as well as the preserve-versus-purify heuristic for legacy processes. We will make any minor editorial adjustments in the revised version.
Circularity Check
No significant circularity
full rationale
The paper is explicitly conceptual and agenda-setting. It introduces the six-ring Semi-Executable Stack as an independent diagnostic reference model, develops the reframing through three worked cases, and offers a preserve-versus-purify heuristic. No equations, fitted parameters, predictions, or derivations are present that could reduce to their own inputs by construction. The central claim concerns an expansion of scope rather than a mechanism derived from its own assumptions or self-citations, so the argument does not rest on circular support.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: AI-based systems driven by LLMs and agentic harnesses are increasingly capable of tasks such as scaffolding, routine test generation, and bug fixing that were previously the domain of software engineers.
invented entities (1)
- Semi-Executable Stack (no independent evidence)
Reference graph
Works this paper leans on
- [1] Emil Alégroth, Robert Feldt, and Helena Holmström Olsson. Transitioning manual system test suites to automated testing: An industrial case study. In 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, pages 56–65, 2013. doi: 10.1109/ICST.2013.14
- [2] Anthropic. Anthropic economic index, January 2026 report, 2026. URL https://www.anthropic.com/research/anthropic-economic-index-january-2026-report. Accessed April 22, 2026
- [3] Gordon D. Baxter and Ian Sommerville. Socio-technical systems: From design methods to systems engineering. Interacting with Computers, 23(1):4–17, 2011
- [4] Joel Becker, Nate Rush, Elizabeth Barnes, and David Rein. Measuring the impact of early-2025 AI on experienced open-source developer productivity, 2025. URL https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/. Model Evaluation and Threat Research (METR). Published July 10, 2025; Accessed April 22, 2026
- [5] Frederick P. Brooks, Jr. No silver bullet: Essence and accidents of software engineering. Computer, 20(4):10–19, 1987. doi: 10.1109/MC.1987.1663532
- [6] Erik Brynjolfsson, Bharat Chandar, and Ruyu Chen. Canaries in the coal mine? Six facts about the recent employment effects of artificial intelligence, 2025. URL https://digitaleconomy.stanford.edu/publication/canaries-in-the-coal-mine-six-facts-about-the-recent-employment-effects-of-artificial-intelligence/. Published November 13, 2025; Accessed April 22, 2026
- [7] Satish Chandra and Maxim Tabachnyk. AI in software engineering at Google: Progress and the path ahead, 2024. URL https://research.google/blog/ai-in-software-engineering-at-google-progress-and-the-path-ahead/. Published June 6, 2024; Accessed April 22, 2026
- [8] Mary Beth Chrissis, Mike Konrad, and Sandy Shrum. CMMI for Development: Guidelines for Process Integration and Product Improvement. Addison-Wesley Professional, 3rd edition, 2011
- [9] Derek DeBellis, Kevin Storer, Nathen Harvey, Matt Beane, Rob Edwards, Edward Fraser, Ben Good, Eirini Kalliamvakou, Gene Kim, Eric Maxwell, Sarah D'Angelo, Sarah Inman, Ambar Murillo, and Daniella Villalba. 2025 state of AI-assisted software development report, 2025. URL https://research.google/pubs/dora-2025-state-of-ai-assisted-software-development-rep...
- [10] Fabrizio Dell'Acqua, Charles Ayoubi, Hila Lifshitz, Raffaella Sadun, Ethan Mollick, Lilach Mollick, Yi Han, Jeff Goldman, Hari Nair, Stewart Taub, and Karim Lakhani. The cybernetic teammate: A field experiment on generative AI reshaping teamwork and expertise, 2025. URL https://www.nber.org/papers/w33641. NBER Working Paper 33641
- [11] Felix Dobslaw, Robert Feldt, Juyeon Yoon, and Shin Yoo. Challenges in testing large language model based software: A faceted taxonomy. ACM Transactions on Software Engineering and Methodology, 35(4):1–38, 2026. doi: 10.1145/3806396
- [12] European Union. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI Act), 2024. URL https://eur-lex.europa.eu/legal-content/en/LSU/?qid=1744648637579&uri=CELEX%3A32024R1689. Official summary and legal reference; Accessed April 22, 2026
- [13] Fabian Fagerholm, Michael Felderer, Davide Fucci, Michael Unterkalmsteiner, Bogdan Marculescu, Markus Martini, Lucas Gren, Lars Göran Wallgren Tengberg, Robert Feldt, Antti Lehtelä, Bettina Nagyváradi, and Jehan Khattak. Cognition in software engineering: A taxonomy and survey of a half-century of research. ACM Computing Surveys, 54(11s):1–36, 2022. doi: ...
- [14] Angela Fan, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo, and Jie M. Zhang. Large language models for software engineering: Survey and open problems. In Proceedings of the 45th IEEE/ACM International Conference on Software Engineering: Future of Software Engineering, pages 31–53, 2023. doi: 10.1109/ICSE-FoSE59343.2023.00008
- [15] Robert Feldt, Francisco Gomes de Oliveira Neto, and Richard Torkar. Ways of applying artificial intelligence in software engineering. In Proceedings of the 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE@ICSE), pages 35–41, 2019
- [16] Robert Feldt, Sungmin Kang, Juyeon Yoon, and Shin Yoo. Towards autonomous testing agents via conversational large language models. In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1688–1693, 2023. doi: 10.1109/ASE56229.2023.00148
- [18] Görkem Giray. A software engineering perspective on engineering machine learning systems: State of the art and challenges. Journal of Systems and Software, 180:111031, 2021. doi: 10.1016/j.jss.2021.111031
- [19] Daniel Graziotin, Per Lenberg, Robert Feldt, and Stefan Wagner. Psychometrics in behavioral software engineering: A methodological introduction with guidelines. ACM Transactions on Software Engineering and Methodology, 31(1):1–36, 2022. doi: 10.1145/3469888
- [20] Lucas Gren and Robert Feldt. Cross-functional AI task forces (X-FAITs) for AI transformation of software organizations. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 793–796, 2025. doi: 10.1145/3756681.3757015
- [21] Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Daniel Luo, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. Large language models for software engineering: A systematic literature review. ACM Transactions on Software Engineering and Methodology, 33(8):1–79, 2024. doi: 10.1145/3695988
- [22] Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik R. Narasimhan. SWE-bench: Can language models resolve real-world GitHub issues? In The Twelfth International Conference on Learning Representations (ICLR), 2024. URL https://openreview.net/forum?id=VTF8yNQM66. Accessed April 22, 2026
- [23] Arsham Gholamzadeh Khoee, Shuai Wang, Robert Feldt, Dhasarathy Parthasarathy, and Yinan Yu. Gatelens: A reasoning-enhanced LLM agent for automotive software release analytics, 2025. URL https://arxiv.org/abs/2503.21735. arXiv:2503.21735; Accessed April 22, 2026
- [24] Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt, Andris Freimanis, Patrick Andersson Rhodin, and Dhasarathy Parthasarathy. Gonogo: An efficient LLM-based multi-agent system for streamlining automotive software release decision-making. In Proceedings of the 36th IFIP WG 6.1 International Conference on Testing Software and Systems (ICTSS), volume 15383 of Lecture...
- [25] Amy J. Ko, Robin Abraham, Laura Beckwith, Alan Blackwell, Margaret Burnett, Martin Erwig, Chris Scaffidi, Joseph Lawrance, Henry Lieberman, Brad Myers, Mary Beth Rosson, Gregg Rothermel, Mary Shaw, and Susan Wiedenbeck. The state of the art in end-user software engineering. ACM Computing Surveys, 43(3):1–44, 2011. doi: 10.1145/1922649.1922658
- [26] Dominik Kreuzberger, Niklas Kühl, and Sebastian Hirschl. Machine learning operations (MLOps): Overview, definition, and architecture, 2023. URL https://arxiv.org/abs/2205.02302. arXiv:2205.02302; Accessed April 22, 2026
- [27] Thomas Kwa, Ben West, Joel Becker, Amy Deng, Katharyn Garcia, Max Hasin, Sami Jawhar, Megan Kinniment, Nate Rush, Sydney von Arx, Ryan Bloom, Thomas Broadley, Haoxing Du, Brian Goodrich, Nikola Jurkovic, Luke Harold Miles, Seraphina Nix, Tao Lin, Neev Parikh, David Rein, Lucas Jun Koba Sato, Hjalmar Wijk, Daniel M. Ziegler, Elizabeth Barnes, and Lawrence ... arXiv preprint arXiv:2503.14499
- [28] Per Lenberg, Robert Feldt, and Lars Göran Wallgren. Behavioral software engineering: A definition and systematic literature review. Journal of Systems and Software, 107:15–37, 2015. doi: 10.1016/j.jss.2015.04.084
- [30] Per Lenberg, Robert Feldt, and Lars Göran Wallgren. Human factors related challenges in software engineering: An industrial perspective. In 2015 IEEE/ACM 8th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE), pages 43–49, 2015. doi: 10.1109/CHASE.2015.13
- [31] Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, et al. Holistic evaluation of language models. Transactions on Machine Learning Research, 2023
- [32] Henry Lieberman, Fabio Paternò, and Volker Wulf, editors. End User Development, volume 9 of Human-Computer Interaction Series. Springer, 2006
- [33] Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, and Yiling Lou. Large language model-based agents for software engineering: A survey, 2024. URL https://arxiv.org/abs/2409.02977. arXiv:2409.02977; Accessed April 22, 2026
- [34] Silverio Martínez-Fernández, Justus Bogner, Xavier Franch, Marc Oriol, Julien Siebert, Adam Trendowicz, Anna Maria Vollmer, and Stefan Wagner. Software engineering for AI-based systems: A survey. ACM Transactions on Software Engineering and Methodology, 31(2), 2022. doi: 10.1145/3487043
- [35] Microsoft WorkLab. 2025: The year the frontier firm is born, 2025. URL https://www.microsoft.com/en-us/worklab/work-trend-index/2025-the-year-the-frontier-firm-is-born. Published April 29, 2025; Accessed April 22, 2026
- [36] Michel Nass, Emil Alégroth, Robert Feldt, Maurizio Leotta, and Filippo Ricca. Similarity-based web element localization for robust test automation. ACM Transactions on Software Engineering and Methodology, 32(3):1–30, 2023. doi: 10.1145/3571855
- [37] National Institute of Standards and Technology. AI risk management framework, 2023. URL https://www.nist.gov/itl/ai-risk-management-framework. Accessed April 22, 2026
- [38] National Institute of Standards and Technology. Artificial intelligence risk management framework: Generative artificial intelligence profile, 2024. URL https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence. Published July 26, 2024; Accessed April 22, 2026
- [39] Lekshmi Murali Rani, Faezeh Mohammadi, Robert Feldt, and Richard Berntsson Svensson. An empirical study on decision-making aspects in responsible software engineering for AI. In 2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pages 575–585. IEEE, 2025. doi: 10.1109/ICSE-SEIP66354.2025.00056
- [40] Lekshmi Murali Rani, Richard Berntsson Svensson, and Robert Feldt. Bridging the socio-emotional gap: The functional dimension of human-AI collaboration for software engineering. URL https://arxiv.org/abs/2601.19387. arXiv:2601.19387; Accessed April 22, 2026
- [42] D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, and Dan Dennison. Hidden technical debt in machine learning systems. In Advances in Neural Information Processing Systems (NeurIPS), pages 2503–2511, 2015
- [43] Richard Berntsson Svensson, Robert Feldt, and Richard Torkar. The unfulfilled potential of data-driven decision making in agile software development. In Agile Processes in Software Engineering and Extreme Programming, volume 355 of Lecture Notes in Business Information Processing, pages 69–85. Springer, 2019. doi: 10.1007/978-3-030-19034-7_5
- [44] Mahan Tafreshipour, Aaron Imani, Eric Huang, Eduardo Santana de Almeida, Thomas Zimmermann, and Iftekhar Ahmed. Prompting in the wild: An empirical study of prompt evolution in software repositories. In Proceedings of the 22nd IEEE/ACM International Conference on Mining Software Repositories (MSR), 2025. doi: 10.1109/MSR66628.2025.00. URL https://2025.msrconf.org/details/msr-2025-technical-papers/10/Prompting-in-the-Wild-An-Empirical-Study-of-Prompt-Evolution-in-Software-Repositorie
- [46] Theocharis Tavantzis and Robert Feldt. Unpacking organizational change in AI transformations of software engineering. In Proceedings of the 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE@ICSE), pages 149–160, 2025. doi: 10.1109/CHASE66643.2025.00026
- [48] Shuai Wang, Yinan Yu, Robert Feldt, and Dhasarathy Parthasarathy. Automating a complete software test process using LLMs: An automotive case study. In Proceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE), pages 373–384, 2025. doi: 10.1109/ICSE55347.2025.00211
- [49] Juyeon Yoon, Robert Feldt, and Shin Yoo. Intent-driven mobile GUI testing with autonomous large language model agents. In 2024 IEEE Conference on Software Testing, Verification and Validation (ICST), pages 129–139, 2024. doi: 10.1109/ICST60714.2024.00020