Pith · machine review for the scientific record

arxiv: 2605.06627 · v1 · submitted 2026-05-07 · 💻 cs.SD · cs.LG


PianoCoRe: Combined and Refined Piano MIDI Dataset


Pith reviewed 2026-05-08 04:05 UTC · model grok-4.3

classification 💻 cs.SD cs.LG
keywords piano MIDI · score alignment · expressive performance · music dataset · MIDI quality · alignment refinement · performance modeling

The pith

A combined and refined piano MIDI dataset provides the largest collection of score-aligned performances to date and improves model robustness on new pieces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper combines and cleans multiple existing piano MIDI corpora to create PianoCoRe, a dataset with over 250,000 performances of more than 5,000 pieces by 483 composers. It offers tiered versions, including a note-aligned subset with 157,207 performances matched to 1,591 scores. This addresses limitations in prior resources like narrow composer coverage, missing alignments, and inconsistent formats. A quality classifier and alignment refinement pipeline called RAScoP help remove corrupted files and fix errors. Training an expressive performance model on this data leads to better results on unseen music compared to smaller or unrefined sets.

Core claim

PianoCoRe unifies major open-source piano MIDI datasets into a large-scale resource with 250,046 performances totaling 21,763 hours. The note-aligned PianoCoRe-A subset is the largest open collection of its kind. A MIDI quality classifier detects corrupted files, and the RAScoP pipeline refines alignments by cleaning temporal errors and interpolating missing notes. Refinement reduces temporal noise and eliminates tempo outliers. An expressive rendering model trained on PianoCoRe shows improved robustness to unseen pieces.
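The tiering behind the quality classifier can be pictured with the thresholds the paper reports for the adjusted alignment ratio R′a (Figure 4). The sketch below is illustrative, not the authors' code; the function name and the handling of ratios that fall between the published bands are assumptions.

```python
# Illustrative sketch of the quality-tier rule from Figure 4, using the
# adjusted alignment ratio R'a. Tier labels follow the figure caption;
# the "unlabeled" fallback for in-between ratios is an assumption.

def classify_midi(is_transcribed: bool, adjusted_alignment_ratio: float) -> str:
    """Assign a quality tier from the adjusted alignment ratio R'a."""
    if not is_transcribed:
        return "HQ"  # any recorded (non-transcribed) MIDI counts as high quality
    r = adjusted_alignment_ratio
    if r > 0.9:
        return "HQ"         # high-quality transcription
    if 0.7 < r < 0.85:
        return "LQ"         # low-quality transcription
    if r < 0.65:
        return "C"          # corrupted
    return "unlabeled"      # ratios between the published bands

print(classify_midi(True, 0.95))  # → HQ
print(classify_midi(True, 0.5))   # → C
```

In the paper's pipeline the Corrupted tier is what the classifier removes before alignment refinement; the boundaries between bands are empirical choices shown in Figure 4.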

What carries the argument

The PianoCoRe dataset with its tiered subsets and the RAScoP alignment refinement pipeline, which cleans temporal alignment errors and interpolates missing notes while the quality classifier removes corrupted files.
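One RAScoP step named here, interpolating missing notes, can be sketched as placing an unaligned score note between its nearest aligned neighbours. This is a minimal linear-interpolation illustration under assumed inputs, not the authors' algorithm.

```python
# A hedged sketch of missing-note interpolation: estimate a performance
# onset for a score note the aligner missed, from the (score position,
# performance onset) pairs of its nearest aligned neighbours.
# Linear interpolation is an assumption; RAScoP's actual rule may differ.

def interpolate_onset(score_pos: float,
                      left: tuple[float, float],
                      right: tuple[float, float]) -> float:
    """Estimate a performance onset (seconds) for a score note at
    `score_pos`, given aligned neighbours (score_pos, perf_onset)."""
    (s0, p0), (s1, p1) = left, right
    if s1 == s0:                      # neighbours in the same chord
        return p0
    frac = (score_pos - s0) / (s1 - s0)
    return p0 + frac * (p1 - p0)

# A score note at beat 1.5, between aligned notes at beats 1.0 and 2.0
# performed at 0.50 s and 1.10 s, lands at 0.80 s.
print(interpolate_onset(1.5, (1.0, 0.50), (2.0, 1.10)))  # → 0.8
```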

If this is right

  • Models for expressive piano performance can be trained on larger, cleaner aligned data, leading to better generalization.
  • The refinement process reduces temporal noise and removes tempo outliers from the data.
  • Researchers gain access to subsets suited for pre-training, large-scale analysis, or precise alignment tasks.
  • Future work in music information retrieval can build on this unified resource instead of fragmented smaller datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Such a dataset could support development of more accurate automatic accompaniment systems or piano transcription tools.
  • Extending similar refinement pipelines to other instruments might create comparable resources for broader music research.
  • Improved robustness suggests that data quality and alignment matter more than sheer volume alone for performance modeling tasks.

Load-bearing premise

The MIDI quality classifier and RAScoP pipeline correctly identify corrupted files and fix alignment errors without introducing systematic biases or removing valid expressive variations.

What would settle it

If a performance rendering model trained on PianoCoRe fails to show improved robustness on a held-out set of unseen pieces compared to models trained on raw data, or if manual review reveals that many valid performances were incorrectly removed or alignments distorted.

Figures

Figures reproduced from arXiv: 2605.06627 by Ilya Borovik.

Figure 1
Figure 1. The three-stage data matching and annotation pipeline used to create the PianoCoRe dataset. Scores from the KunstderFuge and ClassicalMIDI websites were used solely for enriching the representation of annotated performed compositions in PianoCoRe; the copyrighted scores are not redistributed in the final dataset. Since KunstderFuge provides live performance and orchestral MIDI files, inexpressive solo pian… view at source ↗
Figure 2
Figure 2. Statistical overview of the PianoCoRe-C dataset for the 50 most represented composers. Top: the total number of unique pieces per composer (blue) and the number of pieces with a musical score (light blue). Bottom: the average number of performances per piece, accumulated by MIDI source. view at source ↗
Figure 3
Figure 3. Distribution of the number of musical pieces by the number of performances in PianoCoRe-C. Note that PianoCoRe-C is not deduplicated or filtered for quality; this raw, comprehensive collection serves as the foundation for the refined subsets, PianoCoRe-B and PianoCoRe-A. view at source ↗
Figure 4
Figure 4. MIDI performances from ASAP (orange) and ATEPP (blue) grouped by original labels and mapped as a function of the performance-to-score note ratio Rn and the adjusted alignment ratio R′a: (1) Score (S): deadpan score MIDI performances; (2) High Quality (HQ): any recorded MIDI, or transcribed MIDI with R′a > 0.9; (3) Low Quality (LQ): transcribed, 0.7 < R′a < 0.85; (4) Corrupted (C): transcribed, R′a < 0.65. The q… view at source ↗
Figure 5
Figure 5. Real-world alignment challenges motivating the RAScoP pipeline. Top: local timing errors (crossed links) and missing/extra notes. Bottom: large structural deviation from a missing score segment, causing incorrect links; other performed notes remain usable. Alignments were computed with Parangonar. Alignment holes: continuous regions of unaligned notes in the score or performance, often caused by skipped… view at source ↗
Figure 6
Figure 6. Note-level alignment and the RAScoP pipeline for alignment refinement. The processing steps are demonstrated on an artificial example containing all types of errors; score notes are drawn in black and performance notes in blue and green. For inter-onset intervals, the method estimates the maximum and minimum plausible time shifts Δtmax(oi) and Δtmin(oi) between the current and previous score o… view at source ↗
Figure 7
Figure 7. Applying the full pipeline (H+O) significantly reduces the standard deviation of inter-onset deviations within chords, indicating cleaner note timing patterns. Furthermore, the distribution of beat tempos becomes more stable and centered around a musically plausible range, as the algorithm corrects for the extreme tempo values implied by raw, noisy alignments. view at source ↗
Figure 8
Figure 8. Validation loss curves for PianoFlow trained on different subsets of the data. Larger and refined training datasets reduce overfitting in the long run. view at source ↗
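The refinement analysis behind Figure 7 evaluates cleanliness via the standard deviation of performance onsets within chords: notes notated simultaneously should be performed nearly simultaneously. The sketch below computes that kind of statistic; the grouping by exact score beat and the averaging are assumptions, not the paper's exact procedure.

```python
# A hedged sketch of the within-chord timing-noise metric suggested by
# Figure 7: group aligned notes by score beat, take the std of their
# performance onsets, and average over chords. Simplified illustration.
from statistics import pstdev

def chord_onset_spread(aligned_notes):
    """Mean within-chord std of performance onsets (seconds).
    `aligned_notes` is a list of (score_beat, perf_onset_seconds) pairs."""
    chords = {}
    for beat, onset in aligned_notes:
        chords.setdefault(beat, []).append(onset)
    spreads = [pstdev(onsets) for onsets in chords.values() if len(onsets) > 1]
    return sum(spreads) / len(spreads) if spreads else 0.0

# A noisy alignment spreads chord notes over tens of milliseconds;
# after refinement the spread should shrink.
noisy = [(1.0, 0.50), (1.0, 0.58), (2.0, 1.00), (2.0, 1.12)]
refined = [(1.0, 0.50), (1.0, 0.52), (2.0, 1.00), (2.0, 1.02)]
print(chord_onset_spread(noisy) > chord_onset_spread(refined))  # → True
```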
Original abstract

Symbolic music datasets with matched scores and performances are essential for many music information retrieval (MIR) tasks. Yet, existing resources often cover a narrow range of composers, lack performance variety, omit note-level alignments, or use inconsistent naming formats. This work presents PianoCoRe, a large-scale piano MIDI dataset that unifies and refines major open-source piano corpora. The dataset contains 250,046 performances of 5,625 pieces written by 483 composers, totaling 21,763 h of performed music. PianoCoRe is released in tiered subsets to support different applications: from large-scale analysis and pre-training (PianoCoRe-C and deduplicated PianoCoRe-B) to expressive performance modeling with note-level score alignment (PianoCoRe-A/A*). The note-aligned subset, PianoCoRe-A, provides the largest open-source collection of 157,207 performances aligned to 1,591 scores to date. In addition to the dataset, the contributions are: (1) a MIDI quality classifier for detecting corrupted and score-like transcriptions and (2) RAScoP, an alignment refinement pipeline that cleans temporal alignment errors and interpolates missing notes. The analysis shows that the refinement reduces temporal noise and eliminates tempo outliers. Moreover, an expressive performance rendering model trained on PianoCoRe demonstrates improved robustness to unseen pieces compared to models trained on raw or smaller datasets. PianoCoRe provides a ready-to-use foundation for the next generation of expressive piano performance research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper presents PianoCoRe, a unified large-scale piano MIDI dataset aggregating and refining existing open-source corpora into 250,046 performances of 5,625 pieces by 483 composers (21,763 hours total). It releases tiered subsets, with PianoCoRe-A providing the largest open note-aligned collection (157,207 performances aligned to 1,591 scores). Contributions include a MIDI quality classifier to remove corrupted/score-like files and the RAScoP pipeline for correcting temporal alignments and interpolating notes. The work claims that refinement reduces temporal noise and tempo outliers, and that an expressive performance rendering model trained on PianoCoRe shows improved robustness to unseen pieces compared to models trained on raw or smaller datasets.

Significance. If the validation claims hold, PianoCoRe would be a substantial resource for MIR tasks and expressive performance modeling, offering greater scale, alignment, and cleanliness than prior collections while supporting different use cases via its tiers. The open release and practical focus on note-level alignment could enable more reproducible and robust research in symbolic music.

major comments (3)
  1. [Abstract and §6] Abstract and §6 (analysis of refinement): the claim that 'the refinement reduces temporal noise and eliminates tempo outliers' is presented without any quantitative metrics (e.g., before/after distributions of timing variance, tempo statistics, or error rates), baselines, or comparison to ground-truth alignments, which is load-bearing for asserting improved data quality.
  2. [§4 and §5] §4 (MIDI quality classifier) and §5 (RAScoP pipeline): no external validation, inter-annotator agreement, or ablation is reported to confirm that the classifier and pipeline correctly discard only corrupted files and fix alignments without systematically removing valid expressive variations or introducing biases in timing/dynamics; this directly affects the central claim that PianoCoRe-A is the largest high-quality aligned collection.
  3. [§7] §7 (expressive performance rendering experiments): the improved robustness to unseen pieces is asserted but without details on model architecture, training hyperparameters, exact evaluation metrics (e.g., note-onset F1, velocity/timing error), the specific unseen test set, or an ablation comparing the same model trained on raw data with only trivial cleaning, making it impossible to attribute gains to the refinement versus scale alone.
minor comments (3)
  1. [§2 and Table 1] Clarify the exact differences and intended use cases between PianoCoRe-A and PianoCoRe-A* in the main text and any summary tables.
  2. [References and §3] Add explicit DOIs or persistent links for all source corpora in the references and dataset description to improve reproducibility.
  3. [Figures in §5] Ensure all figures showing before/after alignment examples include scale bars or quantitative annotations for visual clarity.
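One metric the report asks for, note-onset F1, is standard in MIR evaluation. The sketch below shows the idea under an assumed 50 ms onset tolerance (the common mir_eval default); the greedy one-to-one matching is a simplification of the usual optimal matching.

```python
# Hedged sketch of note-onset F1: an estimated onset counts as correct if
# it falls within `tol` seconds of an unmatched reference onset.
# Greedy matching is a simplification; function name is illustrative.

def onset_f1(ref_onsets, est_onsets, tol=0.05):
    """F1 of estimated vs. reference note onsets (seconds), within `tol`."""
    if not ref_onsets or not est_onsets:
        return 0.0
    ref = sorted(ref_onsets)
    used = [False] * len(ref)
    matched = 0
    for e in sorted(est_onsets):
        for i, r in enumerate(ref):
            if not used[i] and abs(e - r) <= tol:
                used[i] = True
                matched += 1
                break
    precision = matched / len(est_onsets)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall) if matched else 0.0

# Two of three onsets land within tolerance; the third is 200 ms off.
print(round(onset_f1([0.0, 0.5, 1.0], [0.01, 0.52, 1.2]), 3))  # → 0.667
```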

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that additional quantitative evidence and experimental details are needed to strengthen the claims and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract and §6] Abstract and §6 (analysis of refinement): the claim that 'the refinement reduces temporal noise and eliminates tempo outliers' is presented without any quantitative metrics (e.g., before/after distributions of timing variance, tempo statistics, or error rates), baselines, or comparison to ground-truth alignments, which is load-bearing for asserting improved data quality.

    Authors: We acknowledge that the current manuscript presents the refinement effects qualitatively without supporting statistics. In the revised version we will add explicit before/after quantitative metrics in §6, including distributions of timing variance, tempo outlier counts, and any available alignment error rates relative to ground-truth scores where they exist. These will also be summarized in the abstract. revision: yes

  2. Referee: [§4 and §5] §4 (MIDI quality classifier) and §5 (RAScoP pipeline): no external validation, inter-annotator agreement, or ablation is reported to confirm that the classifier and pipeline correctly discard only corrupted files and fix alignments without systematically removing valid expressive variations or introducing biases in timing/dynamics; this directly affects the central claim that PianoCoRe-A is the largest high-quality aligned collection.

    Authors: We agree that external validation and ablations are important. We will add an ablation study in the revised manuscript quantifying the effect of the classifier and RAScoP on dataset size and quality metrics, plus a description of our internal validation procedure (manual sampling and held-out checks). We will clarify that inter-annotator agreement is not applicable here as the process is automated with post-hoc inspection rather than multi-annotator labeling. revision: yes

  3. Referee: [§7] §7 (expressive performance rendering experiments): the improved robustness to unseen pieces is asserted but without details on model architecture, training hyperparameters, exact evaluation metrics (e.g., note-onset F1, velocity/timing error), the specific unseen test set, or an ablation comparing the same model trained on raw data with only trivial cleaning, making it impossible to attribute gains to the refinement versus scale alone.

    Authors: We will expand §7 with complete details on model architecture, training hyperparameters, exact evaluation metrics (including note-onset F1 and timing/velocity errors), and the composition of the unseen test set. We will also include an ablation comparing the identical model trained on the raw versus refined data to isolate the contribution of refinement from scale. revision: yes

Circularity Check

0 steps flagged

No circularity: dataset curation with no derivation chain or fitted predictions

full rationale

The paper describes aggregation of existing MIDI corpora, application of a quality classifier, and a heuristic alignment pipeline (RAScoP), followed by empirical checks on temporal noise and model robustness. No equations, first-principles derivations, or predictions are presented that could reduce to their inputs by construction. Claims about scale and improved robustness rest on data processing and external comparisons rather than on self-referential definitions or load-bearing self-citation. This is standard data-resource work with no load-bearing mathematical steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the work relies on existing public MIDI corpora and standard alignment techniques.

pith-pipeline@v0.9.0 · 5561 in / 948 out tokens · 57981 ms · 2026-05-08T04:05:39.406021+00:00 · methodology


Reference graph

Works this paper leans on

53 extracted references · 2 canonical work pages
