New Sinhala OCR dataset from 1981-2019 legislative acts enables LightOnOCR-2-1B to reach 1.05% CER, beating Surya-OCR, Tesseract, and Google Document AI.
What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Cross-Temporal Sinhala OCR: Page-Level Adaptation and Diachronic Analysis