arxiv: 2605.14016 · v1 · submitted 2026-05-13 · 💻 cs.SE · cs.SD

Case Studies and Reflections on Agentic Software Engineering for Rapid Development of Digital Music Instruments

Matthew John Yee-King

Pith reviewed 2026-05-15 04:51 UTC · model grok-4.3

keywords: agentic software engineering · digital music instruments · JUCE framework · audio plugins · case studies · Music Mouse · Continuator · autoethnography

The pith

Agentic software engineering enables rapid development of interoperable digital music instruments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how agentic software engineering can address challenges of longevity, interoperability, and high barriers to entry in digital music instrument creation. It details three case studies in which an experienced developer used AI coding agents to re-implement Laurie Spiegel's Music Mouse as a JUCE plugin, translate Pachet's Continuator from Python to C++, and add a new 3D OpenGL interface to an existing tracker sequencer. Autoethnographic reflections on the prompt logs identify practical techniques that made the process effective. A sympathetic reader would see this as evidence that the method could let more developers produce maintainable audio software.

Core claim

By directing agentic software engineering tools to generate and refine C++ code within the JUCE framework, the developer produced working versions of Music Mouse and Continuator as native plugins and extended a tracker sequencer with a three-dimensional user interface, while recording prompt interactions that revealed workable patterns for audio-software tasks.

What carries the argument

Agentic software engineering (ASE), in which large-language-model agents generate, debug, and iterate on code in response to human prompts.

If this is right

Legacy music systems can be re-implemented as modern native plugins more quickly.
Cross-language translation of audio algorithms becomes feasible without full manual rewriting.
New interface elements such as 3D views can be added to existing sequencers with reduced effort.
Prompt-log patterns can serve as templates for similar audio-development projects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Musicians without programming backgrounds might eventually build custom instruments directly through guided agents.
The same prompting patterns could shorten development cycles in other real-time creative software domains.
Long-term tracking of the resulting plugins would be needed to confirm claims of improved longevity.

Load-bearing premise

The experiences and prompt logs from three projects by one experienced developer will generalize to effective practices usable by non-programmer musicians.

What would settle it

A trial in which non-programmer musicians attempt the same three tasks with ASE tools and produce non-functional or non-interoperable plugins.

Figures

Figures reproduced from arXiv: 2605.14016 by Matthew John Yee-King.

**Figure 1.** NIMEs programmed by Codex.

**Figure 2.** Interacting with codex in the VSCode extension.

**Figure 3.** Project file structure for WebView starter tem…

**Figure 4.** Music Mouse running on an emulated Atari ST; …

**Figure 5.** Timelines for the three case studies: Music Mouse (top), Continuator (middle), OpenGL tracker interface (bottom).

**Figure 6.** Four iterations of the Music Mouse reimplementation, clockwise from top left. V1 approximates the user interface with a mostly working mouse to MIDI interaction and keyboard controls; V2 includes the piano display, V3 adds graphical indications of the note positions and V4 enhances the horizontal and vertical bars.

**Figure 7.** Original Python Continuator interface (left) then …

**Figure 8.** Four iterations of the tracker UI: original JUCE …

Original abstract

The article explores the use of agentic software engineering (ASE) in the development of innovative audio software. It begins with a review of background work that lays out the challenges of longevity, interoperability and barriers to entry in digital music instrument creation, explaining recent developments in ASE and highlighting the possibility that ASE can lower barriers to entry and facilitate creation of interoperable software with greater longevity. Following that, we present case studies wherein we used ASE technology in three distinct ways to develop audio software in the C++ language with the JUCE framework. In case study 1, we re-implement Laurie Spiegel's 'Music Mouse' software as a native plugin. In case study 2, we translate Pachet's 'Continuator' system from Python into a native plugin. In case study 3, we develop a new 3D user interface for an existing 'tracker' sequencer using OpenGL. We describe the experiences of the human developer in the case studies via autoethnographic discussion of the prompt logs and snapshots of the software as it was developed. We identify effective practice for ASE use in this domain and suggest future steps for the work involving evaluation of the method with non-programmer musicians.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Three original case studies apply agentic AI to JUCE music plugin development with transparent process logs, but claims about lowered barriers rest on one experienced developer's autoethnography without baselines or non-programmer tests.

The paper's main value is the three concrete case studies: reimplementing Music Mouse as a JUCE plugin, porting Continuator from Python, and building a 3D OpenGL tracker UI. These are new applications of agentic software engineering to audio work, and the autoethnographic write-up with prompt logs and development snapshots gives a clear view of what the process looked like day to day. The background section on longevity and interoperability issues in digital instruments is also useful context. The reflections on effective prompting practices for C++ and JUCE feel grounded in the actual work done.

The soft spots are straightforward. All the development was done by one experienced programmer, so there are no direct comparisons to traditional coding effort, no metrics on time saved or code maintainability, and no involvement of the non-programmer musicians the paper says would benefit most. The broader claims about reduced entry barriers and longer-lived software therefore stay tentative, which the manuscript itself notes by calling for future evaluation. The citation pattern is reasonable and draws on relevant prior work without obvious omissions.

This is the kind of practical report that developers already working in music tech or AI-assisted coding would find immediately usable for trying similar approaches. It is not a rigorous evaluation study, but the case studies are honest and detailed enough that a serious editor should send it to peer review so reviewers can push on the generalization and suggest concrete ways to add baselines or user tests.

Referee Report

2 major / 2 minor

Summary. The paper explores the application of agentic software engineering (ASE) to the rapid development of digital music instruments. It reviews challenges in longevity, interoperability, and barriers to entry in the field, then presents three autoethnographic case studies using ASE with C++ and JUCE: re-implementing Music Mouse as a native plugin, porting the Continuator system from Python, and developing a new 3D OpenGL user interface for a tracker sequencer. Through reflections on prompt logs and development snapshots, it identifies effective practices and suggests that ASE can lower barriers and improve software qualities, with recommendations for future evaluation with non-programmer musicians.

Significance. If the observations hold, this work provides valuable practical insights and reflections on integrating ASE into creative audio software development, highlighting potential for faster prototyping and better interoperability. The autoethnographic approach offers detailed process descriptions that could inform practitioners, though the lack of quantitative metrics and external validation limits the strength of claims about lowered barriers and generalizability to non-programmers.

major comments (2)

Abstract and introduction: The central claim that ASE lowers barriers to entry and facilitates interoperable software with greater longevity rests on autoethnographic reflections from a single experienced developer across the three case studies; this is insufficient to support generalization without comparative baselines or participation by non-programmer musicians, as the manuscript itself notes in its future-work section.
Case studies section (all three): No quantitative metrics are reported for development effort, code maintainability, interoperability, or longevity (e.g., no comparisons of time-to-completion, error rates, or post-development usage against non-ASE baselines), leaving the claims of rapid development and improved software qualities supported only by subjective prompt-log discussion.

minor comments (2)

The prompt logs and software snapshots could be presented with clearer formatting or tables to improve readability and allow readers to trace specific ASE interactions to outcomes.
Background section: A brief comparison table of traditional vs. ASE-assisted development challenges in DMIs would help contextualize the case-study contributions.

Simulated Author's Rebuttal

2 responses · 2 unresolved

We thank the referee for the constructive and detailed review. We agree that the work is exploratory and autoethnographic, and we will revise the manuscript to more clearly delimit the scope of our claims, emphasize limitations, and avoid overgeneralization while retaining the value of the detailed process reflections.

Referee: Abstract and introduction: The central claim that ASE lowers barriers to entry and facilitates interoperable software with greater longevity rests on autoethnographic reflections from a single experienced developer across the three case studies; this is insufficient to support generalization without comparative baselines or participation by non-programmer musicians, as the manuscript itself notes in its future-work section.

Authors: We fully agree that the observations derive from a single experienced developer using autoethnography and therefore cannot support broad generalizations about lowered barriers or improved longevity without further validation. The manuscript already flags this limitation in the future-work section. In revision we will temper the abstract and introduction to frame the findings as preliminary insights and effective-practice suggestions rather than definitive claims, while preserving the concrete case-study descriptions.

Revision: partial

Referee: Case studies section (all three): No quantitative metrics are reported for development effort, code maintainability, interoperability, or longevity (e.g., no comparisons of time-to-completion, error rates, or post-development usage against non-ASE baselines), leaving the claims of rapid development and improved software qualities supported only by subjective prompt-log discussion.

Authors: The case studies are deliberately qualitative, relying on autoethnographic analysis of prompt logs and development snapshots. No quantitative metrics (time-to-completion, error rates, maintainability scores, or controlled baselines) were collected because the study design focused on process reflection rather than comparative measurement. We will add an explicit statement in the case-studies section clarifying the qualitative character of the evidence and its attendant limitations.

Revision: partial

standing simulated objections not resolved

Adding quantitative metrics or controlled comparisons against non-ASE baselines would require a new study design with timed tasks, multiple developers, and objective code-quality measures, which cannot be performed in a revision of the present manuscript.
Involving non-programmer musicians for validation would necessitate participant recruitment, ethics approval, and a separate evaluation protocol that lies outside the scope of revising this autoethnographic paper.

Circularity Check

0 steps flagged

No circularity: empirical case studies with no derivations or self-referential reductions

The manuscript reports three autoethnographic case studies of ASE use in C++/JUCE audio software development (re-implementing Music Mouse, porting Continuator, building a 3D OpenGL tracker UI) together with reflections on prompt logs. No equations, fitted parameters, predictions, uniqueness theorems, or ansatzes appear. Claims that ASE lowers barriers rest on direct description of the author's observed workflow rather than any reduction to prior inputs by construction or load-bearing self-citation. The text itself flags the absence of non-programmer evaluation and quantitative baselines, confirming the argument is presented as reflective experience rather than a closed deductive loop.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is qualitative and relies on standard assumptions from software engineering and HCI research rather than formal axioms, free parameters, or invented entities.
