Sheet music transformer++: End-to-end full-page optical music recognition for pianoform sheet music

2024 · arXiv 2405.12105

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio

cs.SD · 2025-05-19 · conditional · novelty 6.0

A unified Transformer model with modality-specific tokenization, trained on a new 1300-hour multimodal music dataset, outperforms single-task baselines on optical music recognition and other translations while achieving the first score-image-conditioned audio generation.

A High-Accuracy Optical Music Recognition Method Based on Bottleneck Residual Convolutions

cs.CV · 2026-04-07 · unverdicted · novelty 3.0

A CNN using ResNet-v2-style residual bottleneck blocks and multi-scale dilated convolutions followed by BiGRU and CTC loss achieves SeER of 7.52% and SyER of 0.45% on the Camera-PrIMuS dataset for optical music recognition.

Direct content-based retrieval from music scores images

cs.CV · 2026-05-21

citing papers explorer

Showing 3 of 3 citing papers.

Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio · cs.SD · 2025-05-19 · conditional · none · ref 22
A High-Accuracy Optical Music Recognition Method Based on Bottleneck Residual Convolutions · cs.CV · 2026-04-07 · unverdicted · none · ref 35
Direct content-based retrieval from music scores images · cs.CV · 2026-05-21 · unreviewed · ref 45
