pith. machine review for the scientific record.

arxiv: 2604.16393 · v3 · submitted 2026-03-28 · 💻 cs.SE · cs.HC

Recognition: 2 Lean theorem links

How Do Developers Interact with AI? An Exploratory Study on Modeling Developer Programming Behavior

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 22:10 UTC · model grok-4.3

classification 💻 cs.SE cs.HC
keywords AI-assisted programming · developer behavior model · intention and emotion · user study · programming workflows · AI tool interaction · emotional impact

The pith

Developers using AI during programming tasks focus more on creating and verifying code while maintaining steadier emotions than those coding without AI.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports results from a mixed-methods study involving 76 developers who completed programming tasks in either Python or Java, divided into AI-assisted and non-AI groups. Participants retrospectively labeled their intentions, actions, tools, and emotions while reviewing screen recordings of their work, with additional data from surveys and interviews. This led to the S-IASE model, which frames any development state through four dimensions: the developer's intention, the concrete action taken, the supporting tool in use, and the accompanying emotion. Analysis showed AI users spent more time actively generating and evaluating code with fewer emotional swings, while some reported guilt about depending on AI. A sympathetic reader would care because these hidden dimensions suggest AI tools could be designed to better match or support the full experience of coding rather than just speeding up output.

Core claim

The central claim is that developer programming behavior, especially in the presence of AI, can be described using the S-IASE model with four dimensions—intention, action, supporting tool, and emotion—for any given development state. The study data revealed distinct aggregated patterns: AI-assisted developers engaged more in active code creation, evaluation, and verification, and displayed emotionally stable flows, unlike the fluctuating emotions seen in the non-AI group. Interviews added that reliance on AI sometimes produced impostor-like feelings of guilt or self-doubt.

What carries the argument

The S-IASE model, a four-dimensional framework that characterizes each development state by the developer's intention, the action performed, the tool used, and the emotion experienced.
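The four dimensions can be pictured as a simple record type. The sketch below is our own illustration in Python: the field names and label values are invented for clarity, and the paper's replication package may encode development states quite differently.

```python
from dataclasses import dataclass
from enum import Enum

class Emotion(Enum):
    # Illustrative labels only; the paper's emotion taxonomy may differ.
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"

@dataclass
class DevelopmentState:
    """One labeled development state in the S-IASE framing.

    Field names are a paraphrase of the four dimensions, not the
    authors' schema.
    """
    intention: str        # e.g. "create code", "verify result"
    action: str           # e.g. "prompt AI", "run tests"
    supporting_tool: str  # e.g. "ChatGPT", "IDE debugger", "none"
    emotion: Emotion

# A session is then just an ordered sequence of labeled states:
session = [
    DevelopmentState("create code", "prompt AI", "ChatGPT", Emotion.NEUTRAL),
    DevelopmentState("verify result", "run tests", "IDE", Emotion.POSITIVE),
]
```

Framing each state as one record makes the paper's sequential analysis natural: aggregated patterns are counts over these records, and sequential patterns are transitions between consecutive ones.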

If this is right

  • AI assistance shifts developer effort toward actively creating code and verifying AI-generated results rather than other activities.
  • Developers experience fewer emotional fluctuations when using AI tools compared to traditional non-AI workflows.
  • Some developers report guilt or self-doubt tied to relying on AI, even when performance improves.
  • Sequential patterns in the four dimensions distinguish AI-assisted programming from non-AI programming.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The model could support real-time AI features that detect shifting intentions and adjust suggestions accordingly.
  • Emotional stability observed with AI might lower burnout risk during extended sessions, though this link remains untested.
  • Future AI tools could be tuned to preserve some non-AI emotional rhythms when developers prefer them.

Load-bearing premise

Developers can accurately recall and label their true intentions and emotions after the tasks by watching recordings of their own screens.

What would settle it

A new experiment that collects real-time emotion data via heart-rate monitors or concurrent verbal reports during identical tasks and checks whether those measures match the retrospective labels used to build the S-IASE model.
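Such a validation could be scored, hypothetically, by aligning the retrospective labels and the concurrent reports to the same time windows and computing chance-corrected agreement. A self-contained sketch using Cohen's kappa, with invented label sequences (the paper reports no such comparison):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Raw fraction of windows where the two sources agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Retrospective labels vs. hypothetical concurrent self-reports,
# aligned to the same one-minute windows:
retro = ["neutral", "neutral", "frustrated", "happy", "neutral"]
live  = ["neutral", "frustrated", "frustrated", "happy", "neutral"]
kappa = cohens_kappa(retro, live)  # 0.6875 for these made-up sequences
```

High kappa across participants would support the load-bearing premise; low kappa would suggest the retrospective labels reconstruct rather than recover the in-the-moment states.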

Figures

Figures reproduced from arXiv: 2604.16393 by Bowen Xu, Kathryn Thomasset Stolee, Yinan Wu, Ze Shi Li.

Figure 1. Overview of our study steps. Participants in the AI-assisted group were explicitly informed that they could freely install and use their preferred AI assistants (e.g., ChatGPT or GitHub Copilot) without restriction, whereas participants in the non-AI group were instructed not to use any AI assistants during the tasks.
Figure 2. Our annotation tool's GUI.
Figure 3. Mean number of intention occurrences per participant across the four groups.
Figure 4. Mean number of action occurrences per participant across the four groups.
Figure 5. Mean supporting tool and emotion occurrences per participant across the four groups.
read the original abstract

Artificial Intelligence (AI) is reshaping how developers adopt software engineering practices, yet the multi-dimensional nature of developer-AI interaction remains under-explored. Prior studies have primarily examined dimensions observable from developer activities such as "Prompt Crafting" and "Code Editing," overlooking how hidden intentions and emotional dimensions intertwine with concrete actions during AI-assisted programming. To understand this phenomenon, we conducted a mixed-methods study with 76 developers split into AI-assisted and non-AI groups. Each performed programming tasks (Python with API management or Java with SQL). Developers retrospectively labeled their self-reported intentions, tool-supported actions, and emotions from screen recordings, supplemented by surveys and interviews. Our user study resulted in a novel model named S-IASE with four dimensions to describe programming behavior: intention, action, supporting tool, and emotion for a given development state. Our analysis reveals aggregated and sequential behavioral patterns. For example, using AI assistants often makes developers more focused on actively creating code, evaluating, and verifying generated results. AI-assisted participants showed emotionally stable development flow, as opposed to non-AI-assisted participants who experienced more fluctuating emotions. Interviews revealed further nuance: some developers reported impostor-like feelings, expressing guilt or self-doubt about relying on AI. Our work bridges an important gap in understanding the complexities of developer-AI interaction in programming context.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper reports results from a mixed-methods study with 76 developers split into AI-assisted and non-AI groups who completed Python (API) or Java (SQL) tasks. Participants viewed their own screen recordings after the fact to retrospectively label intentions, actions, supporting tools, and emotions for each development state; these labels were aggregated with survey and interview data. The central contribution is the derivation of the S-IASE model, a four-dimensional taxonomy (intention, action, supporting tool, emotion) claimed to capture programming behavior. Analysis of the labeled sequences yields patterns such as greater focus on creation/verification and emotionally stable flow among AI users, contrasted with fluctuating emotions in the non-AI group, plus interview reports of impostor-like guilt.

Significance. If the labeling procedure can be shown to recover contemporaneous states with acceptable fidelity, the S-IASE model supplies a useful integrative lens that moves beyond the observable-action focus of prior AI-assistance studies. The 76-participant sample and mixed-methods design are strengths for an exploratory study, generating concrete behavioral sequences and emotional contrasts that could guide tool design and training interventions.

major comments (1)
  1. [§4] Data Collection and Labeling Procedure: The S-IASE dimensions and all reported patterns are extracted directly from participants' retrospective self-labels of intentions and emotions. No inter-rater reliability statistics, concurrent think-aloud validation, or comparison against real-time measures are reported; the method therefore rests on the untested assumption that post-hoc reconstruction accurately recovers hidden states rather than introducing recall bias or social-desirability distortion. This assumption is load-bearing for the central claim that the four-dimensional model describes actual programming behavior.
minor comments (2)
  1. [Abstract, §5] The abstract states that AI-assisted participants showed 'emotionally stable development flow' while non-AI participants experienced 'more fluctuating emotions.' Provide the exact operationalization (e.g., variance of labeled emotion scores per minute, or transition counts) and the statistical test used to support this contrast.
  2. [§6] Interview Analysis: The impostor-feeling theme is presented qualitatively. Indicate how many participants expressed it and whether it differed systematically between the AI and non-AI groups.
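For the operationalization question in the first minor comment, one candidate metric is the count of emotion-label changes between consecutive development states. The sketch below uses invented label sequences; the paper's actual measure of "fluctuation" is not specified in the text above.

```python
def emotion_transitions(labels):
    """Count label changes between consecutive states -- one simple
    operationalization of 'emotional fluctuation'."""
    return sum(a != b for a, b in zip(labels, labels[1:]))

# Hypothetical per-participant emotion sequences:
ai_group = ["neutral", "neutral", "neutral", "happy", "happy"]
non_ai   = ["neutral", "frustrated", "neutral", "happy", "frustrated"]

ai_group_changes = emotion_transitions(ai_group)  # 1 change
non_ai_changes = emotion_transitions(non_ai)      # 4 changes
```

Reporting a per-participant statistic like this, plus the test comparing groups, would make the "emotionally stable flow" contrast checkable rather than impressionistic.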

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive and detailed feedback on our exploratory study. We address the single major comment point-by-point below, agreeing that the retrospective labeling method requires explicit discussion of its limitations. We will revise the manuscript accordingly to improve transparency and strengthen the presentation of our contributions.

read point-by-point responses
  1. Referee: [§4] Data Collection and Labeling Procedure: The S-IASE dimensions and all reported patterns are extracted directly from participants' retrospective self-labels of intentions and emotions. No inter-rater reliability statistics, concurrent think-aloud validation, or comparison against real-time measures are reported; the method therefore rests on the untested assumption that post-hoc reconstruction accurately recovers hidden states rather than introducing recall bias or social-desirability distortion. This assumption is load-bearing for the central claim that the four-dimensional model describes actual programming behavior.

    Authors: We agree that the retrospective self-labeling procedure is foundational to deriving the S-IASE model and that the lack of concurrent validation or reliability metrics represents a methodological limitation. Our design choice to use post-task labeling assisted by screen recordings was deliberate: concurrent methods such as think-aloud protocols risk altering natural developer behavior, cognitive flow, and emotional states during the programming tasks. The recordings were provided specifically to support accurate recall of intentions, actions, tools, and emotions. We triangulated the labels with survey and interview data to mitigate bias. Nevertheless, we acknowledge that recall bias and social-desirability effects cannot be fully ruled out without additional validation. In the revised manuscript we will (1) expand the Limitations section with an explicit discussion of these threats and (2) add a paragraph outlining future work that could include concurrent think-aloud or physiological measures to test the fidelity of the retrospective approach. revision: yes

Circularity Check

0 steps flagged

No circularity: S-IASE model derived directly from empirical participant data

full rationale

The paper is a mixed-methods empirical study that collects screen recordings, retrospective self-labels for intentions/emotions/actions/tools, surveys, and interviews from 76 developers. The S-IASE four-dimensional model is presented as emerging from pattern analysis of these labels (aggregated and sequential behaviors). No equations, fitted parameters, predictions, or mathematical derivations appear; the taxonomy is a descriptive organization of observed data rather than a reduction to prior inputs by construction. Self-citations are absent from the provided text and not load-bearing for any derivation. This is standard exploratory qualitative work with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The model rests on the assumption that retrospective labeling captures real-time states; no free parameters or invented physical entities, only the proposed model itself.

axioms (1)
  • domain assumption: Retrospective self-labeling from screen recordings accurately reflects real-time intentions, actions, and emotions. Central to the data-collection method described in the abstract.
invented entities (1)
  • S-IASE model (no independent evidence)
    purpose: framework to describe programming behavior across intention, action, tool, and emotion dimensions; newly proposed based on the study observations.

pith-pipeline@v0.9.0 · 5544 in / 1229 out tokens · 42824 ms · 2026-05-14T22:10:14.852544+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

75 extracted references · 75 canonical work pages · 2 internal anchors
