The Bach Doodle: Approachable music composition with machine learning at scale
Pith reviewed 2026-05-24 21:17 UTC · model grok-4.3
The pith
An optimized Coconet model lets users harmonize melodies in Bach style directly in the browser.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model Coconet in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an interactive experience at scale, we re-implemented Coconet in TensorFlow.js to run in the browser and reduced its runtime from 40s to 2s by adopting dilated depth-wise separable convolutions and fusing operations. We also reduced the model download size to approximately 400KB through post-training weight quantization. We calibrated a speed test based on partial model evaluation time to determine if the harmonization request should be performed locally or sent to remote TPU servers.
What carries the argument
Coconet model re-implemented in TensorFlow.js using dilated depth-wise separable convolutions, operation fusion, post-training weight quantization, and a speed-test-based switch between local browser and remote TPU execution.
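The local-versus-remote switch can be pictured with a minimal Python sketch. It is illustrative only: the function name `choose_backend`, the `threshold_s` parameter and the injectable `clock` are assumptions made here, and the abstract gives neither the calibrated threshold nor how much of the model the partial evaluation covers.

```python
import time
from typing import Callable


def choose_backend(
    partial_eval: Callable[[], None],
    threshold_s: float,
    clock: Callable[[], float] = time.perf_counter,
) -> str:
    """Time one partial model evaluation and decide where to run the request.

    partial_eval is a hypothetical stand-in for evaluating part of the model
    on the user's device; threshold_s is a hypothetical calibrated cutoff.
    """
    start = clock()
    partial_eval()
    elapsed = clock() - start
    # Fast devices harmonize locally; slow ones defer to the remote servers.
    return "local" if elapsed <= threshold_s else "remote"
```

Passing the clock as a parameter keeps the decision logic testable without depending on real hardware speed.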
If this is right
- The system processed more than 55 million harmonization queries in three days.
- Users collectively spent the equivalent of 350 years interacting with the Doodle over those three days.
- A public dataset of user compositions and ratings is released for research.
- The optimizations demonstrate how to run generative music models interactively at internet scale.
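As a rough sanity check on the two headline figures, total interaction time can be divided by the query count. The sketch below assumes a 365.25-day year and treats "more than 55 million" as exactly 55 million, so the result is an upper estimate of a ratio of aggregates, not a measured session length.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def seconds_per_query(total_years: float, queries: int) -> float:
    """Average interaction time per query, from two aggregate figures."""
    return total_years * SECONDS_PER_YEAR / queries


avg = seconds_per_query(350, 55_000_000)
print(f"{avg:.0f} s per query")  # prints "201 s per query"
```

Under these assumptions the average comes to a little over three minutes of interaction per harmonization request.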
Where Pith is reading between the lines
- The hybrid local-remote inference approach could extend to other real-time creative AI tools on the web.
- The released dataset may support studies of how non-musicians approach melody writing.
- Similar quantization and convolution changes might enable browser deployment of other sequence models.
Load-bearing premise
That the quantized and re-implemented Coconet model keeps enough musical quality and coherence to produce an engaging experience for users.
What would settle it
The premise would be undermined if most users gave low ratings to their harmonized outputs, or if total engagement time had stayed far below 350 years across millions of queries.
Original abstract
To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model Coconet (Huang et al., 2017) in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an interactive experience at scale, we re-implemented Coconet in TensorFlow.js (Smilkov et al., 2019) to run in the browser and reduced its runtime from 40s to 2s by adopting dilated depth-wise separable convolutions and fusing operations. We also reduced the model download size to approximately 400KB through post-training weight quantization. We calibrated a speed test based on partial model evaluation time to determine if the harmonization request should be performed locally or sent to remote TPU servers. In three days, people spent 350 years worth of time playing with the Bach Doodle, and Coconet received more than 55 million queries. Users could choose to rate their compositions and contribute them to a public dataset, which we are releasing with this paper. We hope that the community finds this dataset useful for applications ranging from ethnomusicological studies, to music education, to improving machine learning models.
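To make the idea of post-training weight quantization concrete, here is a minimal Python sketch of uniform 8-bit affine quantization. This is an assumed scheme chosen for illustration: the abstract does not state the bit width or quantization method actually used, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_uint8(weights: List[float]) -> Tuple[bytes, float, float]:
    """Map float weights onto 256 evenly spaced levels between min and max."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant tensors
    levels = (min(255, round((w - lo) / scale)) for w in weights)
    return bytes(levels), scale, lo


def dequantize(q: bytes, scale: float, lo: float) -> List[float]:
    """Recover approximate float weights from the stored bytes."""
    return [lo + scale * b for b in q]
```

Each weight is stored in one byte instead of the four used by a 32-bit float, and each reconstructed value lies within half a quantization step (`scale / 2`) of the original. Whether that error is musically audible is exactly the question the referee report raises.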
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper describes the Bach Doodle, the first AI-powered Google Doodle, in which users input melodies through a simplified sheet-music interface and receive Bach-style harmonizations from a re-implemented Coconet model running in TensorFlow.js. The authors detail engineering optimizations (dilated depth-wise separable convolutions, operation fusion, post-training quantization to ~400 KB) that reduce runtime from 40 s to 2 s, a speed-test-based hybrid local/remote inference strategy, the resulting engagement (55 M queries, 350 user-years of interaction over three days), and the release of a public dataset of rated user compositions.
Significance. If the optimizations preserve output quality, the work shows that neural music models can be deployed at massive consumer scale via browser engineering, while the released dataset offers a new resource for ethnomusicology, music education, and ML research. The engagement numbers provide concrete evidence of public interest in approachable AI composition tools.
major comments (1)
- [Abstract] Abstract (paragraph on model re-implementation and optimizations): the manuscript reports no objective or subjective evaluation of whether the TensorFlow.js re-implementation, dilated depth-wise separable convolutions, operation fusion, or post-training quantization preserved the musical coherence or stylistic fidelity of the original Coconet model (Huang et al., 2017). No negative log-likelihood on held-out chorales, no listening tests, and no side-by-side comparison are provided. This is load-bearing for the central claim that the deployed system delivers coherent Bach-style harmonizations at scale; usage statistics alone cannot isolate model quality from interface novelty.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review. We address the major comment on model quality evaluation below.
Point-by-point responses
-
Referee: [Abstract] Abstract (paragraph on model re-implementation and optimizations): the manuscript reports no objective or subjective evaluation of whether the TensorFlow.js re-implementation, dilated depth-wise separable convolutions, operation fusion, or post-training quantization preserved the musical coherence or stylistic fidelity of the original Coconet model (Huang et al., 2017). No negative log-likelihood on held-out chorales, no listening tests, and no side-by-side comparison are provided. This is load-bearing for the central claim that the deployed system delivers coherent Bach-style harmonizations at scale; usage statistics alone cannot isolate model quality from interface novelty.
Authors: We acknowledge that the manuscript does not include new objective or subjective evaluations (e.g., NLL on held-out data or listening tests) of whether the TensorFlow.js re-implementation and optimizations preserved output quality relative to the original Coconet. The paper's central contributions concern the engineering steps that enabled interactive, browser-based inference at massive scale and the resulting public dataset; model performance was established in the 2017 Coconet work. The architectural modifications (dilated depth-wise separable convolutions) were selected to retain receptive field while lowering compute, and post-training quantization plus operation fusion are standard methods expected to incur only minor fidelity loss. We agree, however, that the absence of explicit verification leaves the quality claim under-supported. In revision we will add a short discussion of these design choices and their expected impact on coherence, update the abstract to clarify the paper's scope, and note that the released user ratings offer only indirect, uncontrolled evidence of perceived quality. revision: yes
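The rebuttal's claim that dilated depth-wise separable convolutions retain receptive field while lowering compute can be checked with standard formulas. The Python sketch below is illustrative: the kernel size, dilation schedule and channel counts in the usage note are hypothetical and are not taken from the paper.

```python
from typing import List


def receptive_field(kernel: int, dilations: List[int]) -> int:
    """Receptive field (per axis) of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel - 1) * d for d in dilations)


def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2D convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out


def separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise conv followed by a 1x1 point-wise conv."""
    return kernel * kernel * c_in + c_in * c_out
```

For example, four 3-wide layers with dilations 1, 2, 4, 8 cover 31 positions, against 9 for four undilated layers. With 128 input and output channels, a 3x3 layer drops from 147,456 weights to 17,536 when made separable. Neither calculation says anything about output quality, which is the point the referee presses.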
Circularity Check
No significant circularity: empirical deployment report with no derivations or fitted predictions
Full rationale
This paper reports on the design, optimization, and deployment of the Bach Doodle using a pre-existing Coconet model (cited from Huang et al. 2017). It describes engineering changes (dilated depth-wise separable convolutions, operation fusion, post-training quantization) and usage statistics (55M queries, 350 years of playtime) as direct observations. No equations, parameter fitting, predictions, or uniqueness theorems are presented that could reduce to inputs by construction. The self-citation is for the original model definition only and is not load-bearing for any new claim. The paper is self-contained as an engineering case study against external benchmarks of runtime and size.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Coconet model generates musically appropriate Bach-style harmonizations for arbitrary user melodies.
Reference graph
Works this paper leans on
- [1] For users to input melodies, we designed a simplified sheet-music based interface
ABSTRACT To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle [1], where users can create their own melody and have it harmonized by a machine learning model (Coconet [22]) in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an int...
- [2] The Bach Doodle: Approachable music composition with machine learning at scale
INTRODUCTION Machine learning can extend our creative abilities by offering generative models that can rapidly fill in missing parts of our composition, allowing us to see a prototype of how a piece could sound. To celebrate J.S. Bach’s 334th birthday, we designed the Bach Doodle to create an interactive experience where users can rapidly explore diffe...
- [3] The Bach Doodle: Approachable music composition with machine learning at scale
RELATED WORK Machine learning has been used in algorithmic music composition to support a wide range of musical tasks [5, 13, 19, 28, 29]. Melody harmonization is one of the canonical tasks [7, 11, 20, 26], encourages human-computer interaction [3, 14, 21, 25, 33], and is particularly approachable for novices. Different interfaces and tools have been ...
- [4] star” button on the left-hand side of the sheet music, they can enter “advanced mode
THE BACH DOODLE 4.1 A walk through of the user experience The Bach Doodle user experience begins by demonstrating 4-part harmony using two measures of a Bach chorale, Ach wie flüchtig, ach wie nichtig, BWV 26. By playing the soprano line alone, followed by soprano and alto, and then all four voices, users are shown how the harmony enhances the melody. U...
- [5]
TECHNICAL CHALLENGES In order for users to interact with Coconet via a web interface, we needed to either port it to run client-side on the user’s device or host the model on a server with sufficient speed and capacity to support the number of requests we were expecting. In fact, we did both: we ported the model to TensorFlow.js (TF.js) so that it could ...
- [6]
DATASET RELEASE AND ANALYSIS 6.1 Data structure Every user who interacted with the Bach Doodle had the opportunity to add their composition to a dataset. We make this entire dataset available at https://g.co/magenta/bach-doodle-dataset under a Creative Commons license. Of more than 55 million requests, the user contributed dataset contains over 21.6 mill...
- [7]
CONCLUSION The Bach Doodle enabled large-scale participation in baroque-style counterpoint composition through an intuitive sheet music interface assisted by machine learning. We hope this encourages more creative apps that allow novices and artists to interact with music composition and machine learning in approachable ways. With this paper, we are r...
- [8]
ACKNOWLEDGEMENTS Many thanks to Ann Yuan, Daniel Smilkov and Nikhil Thorat from TensorFlow.js for their expert assistance. A big shoutout to Pedro Vergani, Rebecca Thomas, Jordan Thompson and others on the Doodle team for their contribution to the core components of the Doodle. Thank you Lauren Hannah-Murphy and Chris Han for keeping us on track. Thank y...
- [9] Celebrating Johann Sebastian Bach. https://www.google.com/doodles/celebrating-johann-sebastian-bach. Accessed: 2019-04-04.
- [10] Moray Allan and Christopher KI Williams. Harmonising chorales by probabilistic inference. Advances in neural information processing systems, 17:25–32, 2005.
- [11] Gérard Assayag, Georges Bloch, Marc Chemillier, Arshia Cont, and Shlomo Dubnov. Omax brothers: a dynamic topology of agents for improvization learning. In Proceedings of the 1st ACM workshop on Audio and music computing multimedia, pages 125–132. ACM, 2006.
- [12] Nicolas Boulanger-Lewandowski, Yoshua Bengio, and Pascal Vincent. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. International Conference on Machine Learning, 2012.
- [13] Jean-Pierre Briot, Gaëtan Hadjeres, and François Pachet. Deep learning techniques for music generation-a survey. arXiv preprint arXiv:1709.01620, 2017.
- [14] François Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1251–1258, 2017.
- [15] Ching-Hua Chuan and Elaine Chew. A hybrid system for automatic generation of style-specific accompaniment. In Proceedings of the 4th International Joint Workshop on Computational Creativity, pages 57–64. Goldsmiths, University of London, 2007.
- [16] David Cope. Computers and musical style. Oxford University Press, 1991.
- [17] Michael Scott Cuthbert and Christopher Ariza. music21: A toolkit for computer-aided musicology and symbolic music data. In International Society for Music Information Retrieval, 2010.
- [18] Luke Dahn. Consecutive 5ths and octaves in bach chorales. https://lukedahn.wordpress.com/2016/04/15/consecutive-5ths-and-octaves-in-bach-chorales/. Accessed: 2019-04-12.
- [19] Mary Farbood and Bernd Schöner. Analysis and synthesis of palestrina-style counterpoint using markov chains. In Proceedings of the International Computer Music Conference, 2001.
- [20] Morwaread M Farbood, Egon Pasztor, and Kevin Jennings. Hyperscore: a graphical sketchpad for novice composers. IEEE Computer Graphics and Applications, 24(1):50–54, 2004.
- [21] Jose D Fernández and Francisco Vico. Ai methods in algorithmic composition: A comprehensive survey. Journal of Artificial Intelligence Research, 48:513–582, 2013.
- [22] Rebecca Anne Fiebrink. Real-time human interaction with supervised learning algorithms for music composition and performance. PhD dissertation, Princeton University, 2011.
- [23] George Fitsioris and Darrell Conklin. Parallel successions of perfect fifths in the bach chorales. MUSICAL STRUCTURE, page 52, 2008.
- [24] Kratarth Goel, Raunaq Vohra, and JK Sahoo. Polyphonic music generation by modeling temporal dependencies using a RNN-DBN. In International Conference on Artificial Neural Networks, 2014.
- [25] Gaëtan Hadjeres, François Pachet, and Frank Nielsen. Deepbach: a steerable model for bach chorales generation. In International Conference on Machine Learning, pages 1362–1371, 2017.
- [26] Gaëtan Hadjeres, Jason Sakellariou, and François Pachet. Style imitation and chord invention in polyphonic music with exponential families. arXiv preprint arXiv:1609.05152, 2016.
- [27] Dorien Herremans, Ching-Hua Chuan, and Elaine Chew. A functional taxonomy of music generation systems. ACM Computing Surveys (CSUR), 50(5):69, 2017.
- [28] Dorien Herremans and Kenneth Sörensen. Composing fifth species counterpoint music with a variable neighborhood search algorithm. Expert systems with applications, 40(16):6427–6437, 2013.
- [29] Cheng-Zhi Anna Huang, Sherol Chen, Mark Nelson, and Doug Eck. Mixed-initiative generation of multi-channel sequential structures. In International Conference on Learning Representations Workshop Track, 2018.
- [30] Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, and Douglas Eck. Counterpoint by convolution. ISMIR, 2017.
- [31] Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E Turner, and Douglas Eck. Sequence tutor: Conservative fine-tuning of sequence generation models with kl-control. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1645–1654. JMLR.org, 2017.
- [32] Feynman Liang. Bachbot: Automatic composition in the style of bach chorales. Masters thesis, University of Cambridge, 2016.
- [33] Francois Pachet. The continuator: Musical interaction with style. Journal of New Music Research, 32(3):333–341, 2003.
- [34] François Pachet and Pierre Roy. Musical harmonization with constraints: A survey. Constraints, 6(1):7–19, 2001.
- [35] Alexandre Papadopoulos, Pierre Roy, and François Pachet. Assisted lead sheet composition using flowcomposer. In International Conference on Principles and Practice of Constraint Programming, pages 769–785. Springer, 2016.
- [36] George Papadopoulos and Geraint Wiggins. Ai methods for algorithmic composition: A survey, a critical view and future prospects. In AISB Symposium on Musical Creativity, volume 124, pages 110–117. Edinburgh, UK, 1999.
- [37] Philippe Pasquier, Arne Eigenfeldt, Oliver Bown, and Shlomo Dubnov. An introduction to musical metacreation. Computers in Entertainment (CIE), 14(2):2, 2016.
- [38]
- [39] Ian Simon, Dan Morris, and Sumit Basu. Mysong: automatic accompaniment generation for vocal melodies. In Proceedings of the SIGCHI conference on human factors in computing systems, pages 725–734. ACM, 2008.
- [40] Daniel Smilkov, Nikhil Thorat, Yannick Assogba, Ann Yuan, Nick Kreeger, Ping Yu, Kangyi Zhang, Shanqing Cai, Eric Nielsen, David Soergel, et al. TensorFlow.js: Machine learning for the web and beyond. arXiv preprint arXiv:1901.05350, 2019.
- [41] Bob L Sturm, Oded Ben-Tal, Una Monaghan, Nick Collins, Dorien Herremans, Elaine Chew, Gaëtan Hadjeres, Emmanuel Deruty, and François Pachet. Machine learning research that matters for music creation: A case study. Journal of New Music Research, 48(1):36–55, 2019.
- [42] Benigno Uria, Marc-Alexandre Côté, Karol Gregor, Iain Murray, and Hugo Larochelle. Neural autoregressive distribution estimation. The Journal of Machine Learning Research, 17(1):7184–7220, 2016.
- [43] Benigno Uria, Iain Murray, and Hugo Larochelle. A deep and tractable density estimator. In Proceedings of the International Conference on Machine Learning, 2014.
- [44] Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W Senior, and Koray Kavukcuoglu. Wavenet: A generative model for raw audio.
- [45] Li Yao, Sherjil Ozair, Kyunghyun Cho, and Yoshua Bengio. On the equivalence between deep nade and generative stochastic networks. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2014.