Music transcription modelling and composition using deep learning
Abstract
We apply deep learning methods, specifically long short-term memory (LSTM) networks, to music transcription modelling and composition. We build and train LSTM networks using approximately 23,000 music transcriptions expressed with a high-level vocabulary (ABC notation), and use them to generate new transcriptions. Our practical aim is to create music transcription models useful in particular contexts of music composition. We present results from three perspectives: 1) at the population level, comparing descriptive statistics of the set of training transcriptions and generated transcriptions; 2) at the individual level, examining how a generated transcription reflects the conventions of a music practice in the training transcriptions (Celtic folk); 3) at the application level, using the system for idea generation in music composition. We make our datasets, software and sound examples open and available: https://github.com/IraKorshunova/folk-rnn
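The core idea described in the abstract is a recurrent network trained to predict the next symbol of an ABC transcription and then sampled autoregressively to produce new tunes. The sketch below illustrates that idea with PyTorch and a character-level vocabulary over a tiny placeholder ABC snippet; it is not the authors' folk-rnn implementation (linked above), which uses its own tokenization, dataset of ~23,000 transcriptions, and training setup.

# Minimal character-level LSTM sketch for ABC notation (illustrative only;
# not the folk-rnn implementation). Trains on a tiny placeholder transcription
# and then samples new text one character at a time.
import torch
import torch.nn as nn

corpus = "X:1\nT:Example\nM:4/4\nK:Gmaj\n|:GABc dedB|dedB dedB|c2ec B2dB|c2A2 A2BA:|\n"
vocab = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, hidden=128, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

model = CharLSTM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
data = torch.tensor([stoi[c] for c in corpus]).unsqueeze(0)  # shape (1, T)

# Teacher-forced training: predict character t+1 from characters up to t.
for step in range(300):
    logits, _ = model(data[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, len(vocab)), data[:, 1:].reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Autoregressive sampling: feed the model its own output, starting from "X".
idx = torch.tensor([[stoi["X"]]])
state, generated = None, "X"
for _ in range(200):
    logits, state = model(idx, state)
    probs = torch.softmax(logits[0, -1], dim=-1)
    idx = torch.multinomial(probs, 1).unsqueeze(0)
    generated += itos[idx.item()]
print(generated)

A temperature term applied to the logits before the softmax is a common knob for trading conservative against more varied output when sampling from a model of this kind.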
Forward citations
Cited by 2 Pith papers
- ONOTE: Benchmarking Omnimodal Notation Processing for Expert-level Music Intelligence
  ONOTE is a multi-format benchmark that applies a deterministic pipeline to expose a disconnect between perceptual accuracy and music-theoretic comprehension in leading omnimodal AI models.
- Anchored Cyclic Generation: A Novel Paradigm for Long-Sequence Symbolic Music Generation
  Anchored Cyclic Generation uses anchor features from known music to mitigate error accumulation in autoregressive models, with the Hi-ACG framework delivering better long-sequence symbolic music and music completion p...