arxiv: 2605.04622 · v1 · submitted 2026-05-06 · 💻 cs.LG · cs.AI · cs.SC

Recognition: unknown

Library learning with e-graphs on jazz harmony

Zeng Ren , Maddy Bowers , Xinyi Guan , Martin Rohrmeier


Pith reviewed 2026-05-08 17:20 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.SC

keywords library learning · e-graphs · jazz harmony · program induction · harmonic patterns · deductive parsing · music cognition · refactoring


The pith

A model discovers libraries of harmonic patterns by learning compact programs for jazz chord progressions via e-graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a computational model that treats jazz harmonic progressions as programs built from basic relations among chords and then searches for a shared library of reusable patterns that makes those programs shorter. It first generates candidate programs for each piece through enumeration and deductive parsing, then applies library learning on e-graphs to refactor the programs and extract common substructures at the same time. The joint search produces both a set of refactored programs and an accompanying library of harmonic abstractions. These outputs are checked against criteria of intuitiveness and closeness to how human analysts write harmonic derivations. Success would mean the model reproduces the kind of compression and abstraction humans appear to perform when internalizing musical patterns.
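The compression idea behind library learning can be illustrated with a toy sketch. This is not the paper's method: it extracts only one closed (parameter-free) subtree, uses no e-graph, and the relation name `prep` and the three example programs are hypothetical. It shows how one shared pattern can shorten several programs at once.

```python
from collections import Counter

def size(t):
    """Number of nodes in a term; leaves are strings, inner nodes are tuples."""
    return 1 if not isinstance(t, tuple) else 1 + sum(size(a) for a in t[1:])

def subtrees(t):
    """Yield the term itself and every nested subterm."""
    yield t
    if isinstance(t, tuple):
        for a in t[1:]:
            yield from subtrees(a)

def replace(t, target, name):
    """Replace every occurrence of `target` inside `t` by the symbol `name`."""
    if t == target:
        return name
    if isinstance(t, tuple):
        return (t[0], *(replace(a, target, name) for a in t[1:]))
    return t

def learn_one_abstraction(corpus, name="lib0"):
    """Pick the repeated subtree that saves the most nodes and refactor with it."""
    counts = Counter(s for p in corpus for s in subtrees(p) if isinstance(s, tuple))

    def saving(s):
        # Each of the counts[s] uses shrinks from size(s) nodes to 1;
        # storing the definition in the library costs size(s) nodes once.
        return counts[s] * (size(s) - 1) - size(s)

    best = max(counts, key=saving)
    if saving(best) <= 0:
        return None, corpus
    return best, [replace(p, best, name) for p in corpus]

# Hypothetical programs: ("prep", a, b) reads "a prepares b".
corpus = [
    ("prep", ("prep", "Dm7", "G7"), "Cmaj7"),
    ("prep", ("prep", "Dm7", "G7"), "C6"),
    ("prep", ("prep", "Dm7", "G7"), "Cm7"),
]

best, refactored = learn_one_abstraction(corpus)
print("library entry:", best)
print("size before:", sum(size(p) for p in corpus))
print("size after :", sum(size(p) for p in refactored) + size(best))
```

A real library learner would also consider abstractions with parameters and would weigh many candidates jointly; the greedy single pick here only makes the cost trade-off visible.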

Core claim

Given a corpus of jazz progressions, the model enumerates possible programs composed of primitive harmonic relations for each piece, then jointly optimizes a library of common harmonic patterns together with refactored versions of the programs by integrating deductive parsing with library learning on e-graphs, yielding structures that exhibit similarities to human-written harmonic derivations.

What carries the argument

Joint library learning on e-graphs that simultaneously refactors programs over harmonic primitives and extracts a shared library of patterns.
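As a rough illustration of refactoring by saturation, the sketch below applies a rewrite rule everywhere it matches until no new equivalent term appears, then picks a smallest equivalent. It is deliberately naive: it enumerates whole terms in a Python set, whereas an e-graph stores the same equivalences compactly by sharing subterms. The operator `d5`, the abstraction `f` and the rule are hypothetical names chosen for the example.

```python
def size(term):
    """Number of nodes in a term; leaves are strings, inner nodes are tuples."""
    return 1 if not isinstance(term, tuple) else 1 + sum(size(a) for a in term[1:])

def fold_double_fifth(term):
    """Hypothetical rule: d5(d5(x)) may be rewritten to f(x), with f a library entry."""
    if (isinstance(term, tuple) and term[0] == "d5"
            and isinstance(term[1], tuple) and term[1][0] == "d5"):
        yield ("f", term[1][1])

def neighbours(term, rules):
    """All terms reachable by one rewrite at the root or inside any argument."""
    for rule in rules:
        yield from rule(term)
    if isinstance(term, tuple):
        head, *args = term
        for i, arg in enumerate(args):
            for new_arg in neighbours(arg, rules):
                yield (head, *args[:i], new_arg, *args[i + 1:])

def saturate(term, rules, max_terms=10_000):
    """Collect every term equivalent to `term` under `rules` (fixed point)."""
    seen = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for nxt in neighbours(current, rules):
            if nxt not in seen:
                if len(seen) >= max_terms:
                    raise RuntimeError("saturation budget exceeded")
                seen.add(nxt)
                frontier.append(nxt)
    return seen

program = ("d5", ("d5", ("d5", "I")))
equivalents = saturate(program, [fold_double_fifth])
best = min(equivalents, key=size)  # extraction: any smallest equivalent term
print(len(equivalents), "equivalent terms; smallest has", size(best), "nodes")
```

The `max_terms` guard matters in practice because rule sets that grow terms may never reach a fixed point.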

If this is right

The learned libraries identify recurring harmonic patterns that appear across multiple progressions in the corpus.
The refactored programs supply shorter generative explanations than the initial enumerated programs.
Evaluations of the programs and libraries indicate alignment with human intuition and expert analyses.
The approach reproduces aspects of the iterative reflection process by which humans internalize musical patterns.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same e-graph library-learning loop could be applied to sequential patterns in other domains such as tonal harmony in classical music or syntactic structures in language.
Human music cognition may operate in part by building and reusing libraries of compressed sub-programs in the manner the model demonstrates.
The primitive relations chosen for the model could be tested directly against standard music-theory textbooks to check whether the induced libraries match taught abstractions.
Scaling the method to larger corpora would test whether the same small set of primitives continues to suffice or whether new primitives must be introduced.

Load-bearing premise

Jazz harmonic progressions can be represented as compositions of a small set of primitive harmonic relations whose joint optimization over programs and libraries produces human-intuitive results.

What would settle it

Applying the model to the jazz corpus and obtaining libraries or refactored programs that show no measurable similarity to human harmonic derivations or fail to compress the corpus would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.04622 by Maddy Bowers, Martin Rohrmeier, Xinyi Guan, Zeng Ren.

**Figure 1.** One naive adaptation of library learning (highlighted in blue) to account for chord progression as input (instead of …

**Figure 2.** Representation of harmonic relations. … are added to the e-graph as rewrite rules, and we perform equality saturation [28, 32]: the graph is expanded by exhaustively applying the rewrites until no new equivalent expressions can be derived. This produces a saturated graph 𝒢′ that compactly represents not only all original derivations, but also all equivalent derivations obtainable using the candidate abstrac…

**Figure 3.** We mitigate the combinatorial explosion by adapting the …

**Figure 4.** The model output derivations for the three pieces using the learned abstractions in the library. Question marks …

**Figure 5.** Representative egglog inference rules for the parsing stage.
Au-Rules, with x, x_i, y_i, tX, tY : Template and rX, rY : Primitive relation.
Success Base: tX = Pure r; AntiUnify tX tX :=∪ {tX}; IsAntiUnifier tX tX tX.
Fail Base: tX = Pure rX; tY = Pure rY; rX ≠ rY; AntiUnify tX tY :=∪ {Id}; IsAntiUnifier Id tX tY.
Inductive Case Success: tX = x ◦ [x1 ... xn]; tY = x ◦ [y1 ... yn]; IsFinalizedAntiUnifier t1 x1 y1 ... IsFinalizedAnt…

**Figure 6.** Anti-unification rules. Anti-unifiers are set to be finalized after these rules saturate to a fixed point. This cycle between …

**Figure 7.** babble’s cost set analysis implemented via inference rules. Prune and Reduce are two set-filtering functions from babble. Reduce eliminates provably worse pairs according to a partial order, and Prune is an auxiliary function for beam search that keeps only the top k candidates according to use cost. NodeNeeded is a function that precalculates the e-nodes that require an analysis value in an e-class, and NodeProcessed …
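For readers unfamiliar with anti-unification, which the Figure 6 caption refers to, here is a minimal sketch of the classic first-order version (least general generalization in Plotkin's sense). It is background only and not the paper's egglog rules: in particular, it introduces pattern variables such as `?x0` at mismatches, whereas the rule text extracted with the figures uses `Id` in its failing base case. The term encoding and the relation name `prep` are hypothetical.

```python
def anti_unify(s, t, holes=None):
    """Return the most specific pattern that both `s` and `t` are instances of.

    Terms are strings (leaves) or tuples (head, arg1, ..., argn).
    `holes` maps each mismatching pair of subterms to one shared variable,
    so the same disagreement is always generalized by the same variable.
    """
    if holes is None:
        holes = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same head and arity: generalize argument by argument.
        return (s[0], *(anti_unify(a, b, holes) for a, b in zip(s[1:], t[1:])))
    # Mismatch: introduce (or reuse) a pattern variable for this pair.
    if (s, t) not in holes:
        holes[(s, t)] = f"?x{len(holes)}"
    return holes[(s, t)]

# Two hypothetical derivations of a ii-V-I in different keys.
in_c = ("prep", ("prep", "Dm7", "G7"), "Cmaj7")
in_d = ("prep", ("prep", "Em7", "A7"), "Dmaj7")
pattern = anti_unify(in_c, in_d)
print(pattern)
```

The resulting pattern keeps the shared tree shape and abstracts only the chords that differ, which is what makes anti-unifiers natural candidates for library entries.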

Original abstract

Humans can acquire a highly structured intuitive understanding of musical patterns, yet these patterns often require multiple iterations of reflection and re-listening to internalize fully. To capture such an internalization process, we present a computational model for the learning of jazz harmonic patterns based on library learning. Given a corpus of harmonic progressions, our model searches over a space of programs composed of primitive harmonic relations in order to discover concise generative explanations of the corpus. The model first enumerates possible programs for each piece, and then jointly learns a library of harmonic patterns and refactored programs. To efficiently navigate the vast joint space of programs and libraries, we integrate deductive parsing with library learning on e-graphs. We explore how well our model captures aspects of human musical pattern learning by evaluating the intuitiveness of both programs and libraries, as well as similarities to human-written harmonic derivations.
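Deductive parsing treats parsing as inference: items are derived from axioms by rules until nothing new can be added. The sketch below is a minimal agenda-based version under strong assumptions that are not taken from the paper: chords are reduced to their roots, the only relation is a hypothetical descending-fifth test, and an item `(i, j, h)` means "the chords from position i up to j reduce to head h".

```python
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def descending_fifth(x, y):
    """True if root x lies a perfect fifth above root y (D -> G, G -> C)."""
    return (PITCH_CLASS[x] - PITCH_CLASS[y]) % 12 == 7

def parse(roots):
    """Derive all items (i, j, head) reachable from the single-chord axioms."""
    chart = {(i, i + 1, r) for i, r in enumerate(roots)}  # axioms
    agenda = list(chart)
    while agenda:
        i, j, h = agenda.pop()
        for (k, l, g) in list(chart):
            new = None
            if j == k and descending_fifth(h, g):
                # popped item on the left, chart item on the right
                new = (i, l, g)
            elif l == i and descending_fifth(g, h):
                # chart item on the left, popped item on the right
                new = (k, j, h)
            if new is not None and new not in chart:
                chart.add(new)
                agenda.append(new)
    return chart

def reduces_to_final_chord(roots):
    """Goal item: the whole progression reduces to its last chord."""
    return (0, len(roots), roots[-1]) in parse(roots)

print(reduces_to_final_chord(["D", "G", "C"]))
print(reduces_to_final_chord(["D", "C"]))
```

A fuller system would record how each item was derived, since those derivations are the candidate programs that library learning later refactors.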

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague


The paper gives a concrete integration of deductive parsing into e-graph library learning to jointly optimize programs and a shared library over jazz progressions, with exploratory comparisons to human derivations. The primitives are a small fixed set of harmonic relations, e-graphs share substructure across candidate programs, and the search produces refactored programs plus an induced library that the authors then check against human-written analyses. This setup directly tackles the combinatorial cost of searching programs and libraries together, which is the main technical move.

The motivation ties to iterative human internalization of patterns, and the stress-test note shows no internal contradictions or circular definitions in the argument. The results are presented as exploratory rather than definitive proof of human-likeness, which keeps the claims proportionate. What stands out is the explicit engineering: enumeration of programs, e-graph construction, and joint optimization are all described at a level that lets a reader see how the pieces fit. The human-comparison angle is a reasonable way to test whether the induced structures feel intuitive.

The softer spots sit in the evaluation. The strength of the similarities to human derivations depends on corpus size, the exact comparison protocol, and whether the metrics are quantitative or mostly qualitative. If those turn out to be small-scale or subjective, the human-likeness claim stays tentative. The fixed primitive set is acknowledged as an approximation and necessarily drops some jazz nuance.

This is useful for researchers doing program induction or library learning on symbolic sequences and for computational musicologists who want to see the technique applied to harmony. It is not a core advance in the synthesis algorithms themselves but a clear domain application. I would bring it to a reading group to walk through the e-graph implementation and the evaluation design. It deserves peer review because the technical description is solid enough for referees to judge the claims and give targeted feedback on both the AI and music sides.

Referee Report

0 major / 0 minor

Summary. The paper presents a computational model that integrates deductive parsing with e-graph-based library learning to jointly discover a library of harmonic patterns and concise program representations for a corpus of jazz chord progressions. Programs are built from a small set of primitive harmonic relations; the e-graph enables efficient sharing of substructure during search, and the resulting programs and libraries are evaluated for intuitiveness and similarity to human-written harmonic derivations.

Significance. If the empirical results hold, the work demonstrates a concrete, scalable application of program synthesis and library learning to symbolic music, yielding interpretable generative explanations that align with aspects of human musical intuition. The explicit enumeration of primitives and the use of e-graphs for joint optimization over programs and libraries are strengths that could inform future models of musical cognition.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our work and for recommending minor revision. The assessment that our integration of deductive parsing with e-graph library learning yields interpretable generative explanations for jazz harmony is encouraging, as is the note on potential impact for models of musical cognition. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected


The paper describes an algorithmic search procedure that enumerates candidate programs over an explicitly listed set of primitive harmonic relations, then applies e-graph-based library learning and deductive parsing to jointly optimize a shared library and refactored programs for a given corpus. This is a concrete optimization process over an external discrete space rather than a closed-form derivation or prediction that reduces to fitted parameters by construction. No self-citations are invoked as load-bearing uniqueness theorems, no ansatz is smuggled in, and the reported similarities to human derivations are presented as post-hoc exploratory comparisons rather than as outputs forced by the model's own inputs. The central claim therefore remains independent of the results it produces.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated. The model implicitly assumes that harmonic relations can be treated as composable primitives and that e-graph enumeration is sufficient to navigate the joint program-library space.

pith-pipeline@v0.9.0 · 5447 in / 1120 out tokens · 24138 ms · 2026-05-08T17:20:53.713169+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Prospective Compression in Human Abstraction Learning
cs.AI · 2026-05 · unverdicted · novelty 7.0

Humans exhibit abstraction learning consistent with prospective compression of future tasks in non-stationary domains, unlike retrospective compression algorithms or LLM-based approaches.

Reference graph

Works this paper leans on

37 extracted references · cited by 1 Pith paper

[1] Peter Boot, Anja Volk, and W Bas de Haas. 2016. Evaluating the role of repeated patterns in folk song classification and compression. Journal of New Music Research 45, 3 (2016), 223–238.
[2] Emilios Cambouropoulos. 2006. Musical parallelism and melodic segmentation: A computational approach. Music Perception 23, 3 (2006), 249–268.
[3] David Cao, Rose Kunkel, Chandrakana Nandi, Max Willsey, Zachary Tatlock, and Nadia Polikarpova. 2023. babble: Learning better abstractions with e-graphs and anti-unification. Proceedings of the ACM on Programming Languages 7, POPL (2023), 396–424.
[4] David M Cerna and Temur Kutsia. 2023. Anti-unification and generalization: a survey. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. 6563–6573.
[5] Sudipta Chakrabarty, Ruhul Islam, Emil Pricop, and Hiren Kumar Deva Sarma. 2022. An approach to discover similar musical patterns. IEEE Access 10 (2022), 47322–47339.
[6] Nick Chater and Mike Oaksford. 2013. Programs as causal models: Speculations on mental programs and mental representation. Cognitive Science 37, 6 (2013), 1171–1191.
[7] David Cope. 2002. Recombinant music: using the computer to explore musical style. Computer 24, 7 (2002), 22–28.
[8] W Bas de Haas, Anja Volk, and Frans Wiering. 2013. Structural segmentation of music based on repeated harmonies. In 2013 IEEE International Symposium on Multimedia. IEEE, 255–258.
[9] Pedro J Ponce De Leon and José M Iñesta. 2007. Pattern recognition approach for music style identification using shallow statistical descriptors. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 37, 2 (2007), 248–257.
[10] Edward TR Hall and Marcus T Pearce. 2021. A model of large-scale thematic structure. Journal of New Music Research 50, 3 (2021), 220–241.
[11] Daniel Harasim. 2020. The learnability of the grammar of jazz: Bayesian inference of hierarchical structures in harmony. Ph.D. Dissertation. EPFL.
[12] Daniel Harasim, Christoph Finkensiep, Petter Ericson, Timothy J O’Donnell, and Martin Rohrmeier. 2020. The jazz harmony treebank. In 21st ISMIR, Montréal, Canada, October 11-16, 2020. 207–215.
[13] Yo-Wei Hsiao, Tzu-Yun Hung, Tsung-Ping Chen, and Li Su. 2023. BPS-Motif: A Dataset for Repeated Pattern Discovery of Polyphonic Symbolic Music. In ISMIR. 281–288.
[14] Berit Janssen, W Bas De Haas, Anja Volk, and Peter Van Kranenburg. 2013. Finding repeated patterns in music: State of knowledge, challenges, perspectives. In International Symposium on Computer Music Multidisciplinary Research. Springer, 277–297.
[15] Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. 2015. Human-level concept learning through probabilistic program induction. Science 350, 6266 (2015), 1332–1338.
[16] Olivier Lartillot and Petri Toiviainen. 2007. Motivic matching strategies for automated pattern extraction. Musicae Scientiae 11, 1_suppl (2007), 281–314.
[17] Oriol Nieto, Gautham J Mysore, Cheng-i Wang, Jordan BL Smith, Jan Schlüter, Thomas Grill, and Brian McFee. 2020. Audio-based music structure analysis: Current trends, open challenges, and applications. Transactions of the International Society for Music Information Retrieval 3, 1 (2020).
[18] Jouni Paulus and Anssi Klapuri. 2006. Music structure analysis by finding repeated parts. In Proceedings of the 1st ACM workshop on Audio and music computing multimedia. 59–68.
[19] Jouni Paulus, Meinard Müller, and Anssi Klapuri. 2010. State of the Art Report: Audio-Based Music Structure Analysis. In ISMIR. Utrecht, 625–636.
[20] Marcus Pearce and Daniel Müllensiefen. 2017. Compression-based modelling of musical similarity perception. Journal of New Music Research 46, 2 (2017), 135–155.
[21] Fernando CN Pereira and David HD Warren. 1983. Parsing as deduction. In 21st annual meeting of the association for computational linguistics. 137–144.
[22] Gordon Plotkin. 1970. Lattice theoretic properties of subsumption. Edinburgh University, Department of Machine Intelligence and Perception.
[23] Martin Rohrmeier. 2020. The syntax of jazz harmony: Diatonic tonality, phrase structure, and form. Music Theory and Analysis (MTA) 7, 1 (2020), 1–63.
[24] Joe Cheri Ross, TP Vinutha, and Preeti Rao. 2012. Detecting Melodic Motifs from Audio for Hindustani Classical Music. In ISMIR. 193–198.
[25] Gabriel Sargent, Frédéric Bimbot, and Emmanuel Vincent. 2011. A regularity-constrained Viterbi algorithm and its application to the structural segmentation of songs. In International Society for Music Information Retrieval Conference (ISMIR).
[26] Stuart M Shieber, Yves Schabes, and Fernando CN Pereira. 1995. Principles and implementation of deductive parsing. The Journal of Logic Programming 24, 1-2 (1995), 3–36.
[27] Mark Steedman. 1993. Categorial grammar. Lingua 90, 3 (1993), 221–258.
[28] Ross Tate, Michael Stepp, Zachary Tatlock, and Sorin Lerner. 2009. Equality saturation: a new approach to optimization. In Proceedings of the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 264–276.
[29] Joshua B Tenenbaum, Charles Kemp, Thomas L Griffiths, and Noah D Goodman. 2011. How to grow a mind: Statistics, structure, and abstraction. Science 331, 6022 (2011), 1279–1285.
[30] Peter Van Kranenburg and Eric Backer. 2005. Musical style recognition - a quantitative approach. In Handbook of pattern recognition and computer vision. World Scientific, 583–600.
[31] Anja Volk and Peter Van Kranenburg. 2012. Melodic similarity among folk songs: An annotation study on similarity-based categorization in music. Musicae Scientiae 16, 3 (2012), 317–339.
[32] Max Willsey, Chandrakana Nandi, Yisu Remy Wang, Oliver Flatt, Zachary Tatlock, and Pavel Panchekha. 2021. Egg: Fast and extensible equality saturation. Proceedings of the ACM on Programming Languages 5, POPL (2021), 1–29.
[33] Yihong Zhang, Yisu Remy Wang, Oliver Flatt, David Cao, Philip Zucker, Eli Rosenthal, Zachary Tatlock, and Max Willsey. 2023. Better together: Unifying datalog and equality saturation. Proceedings of the ACM on Programming Languages 7, PLDI (2023), 468–492.
[34] Yu Zhang, Ziya Zhou, and Maosong Sun. 2022. Influence of musical elements on the perception of ‘Chinese style’ in music. Cognitive Computation and Systems 4, 2 (2022), 147–164.
[35] Naomi Ziv and Zohar Eitan. 2007. Themes as prototypes: Similarity judgments and categorization tasks in musical contexts. Musicae Scientiae 11, 1_suppl (2007), 99–133.

A Encodings in egglog

Listing 1. Encoding the Descending5th rule in egglog. The rules consider chord qualities as equivalent up to the essential seventh-chord structure. For example, major seventh...