Leveraging Morphology for Historical Script Metrological Analysis

Dominique Stutzmann; Malamatenia Vlachou Efstathiou; Mathieu Aubry; Rapha\"el Baena

arxiv: 2606.09446 · v1 · pith:CWQ7PHMWnew · submitted 2026-06-08 · 💻 cs.CV

Leveraging Morphology for Historical Script Metrological Analysis

Malamatenia Vlachou Efstathiou , Rapha\"el Baena , Dominique Stutzmann , Mathieu Aubry This is my paper

Pith reviewed 2026-06-27 17:15 UTC · model grok-4.3

classification 💻 cs.CV

keywords paleographyhistorical manuscriptscharacter prototypestransformer detectionmetrological analysisscript variationhandwritten text recognitionline-level supervision

0 comments

The pith

A transformer architecture learns character prototypes from line-level transcriptions to produce scalable paleographic measurements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that morphological script analysis, by learning character prototypes via a transformer-based detector and prototype reconstruction module, yields interpretable and stable measurements of characters, bigrams, and inter-unit spaces. A sympathetic reader would care because existing handwritten text recognition systems transcribe documents at scale yet leave paleographers without automatic visual metrics for studying script forms and scribe identities. The method requires only line-level transcriptions for supervision, improves bounding-box accuracy over the Learnable Typewriter baseline, and is shown on 160 annotated pages of a fourteenth-century codex copied by four hands. Measurements differentiate graphical profiles across hands and detect subtle intra-hand variations, with a single column of text sufficing per page.

Core claim

A transformer-based detection architecture together with a prototype-based line reconstruction module learns prototypical characters and their occurrence, deformation, and positioning from line-level transcriptions alone; this produces accurate character bounding boxes and enables automatic paleographic measurements of characters, bi-grams, and spaces that differentiate graphical profiles and reveal subtle variations across the four hands in the extended Paris, BnF, fr. 2813 codex.

What carries the argument

Prototype-based line reconstruction module that learns character prototypes and their deformations and positions from line transcriptions.

If this is right

Accurate character bounding boxes become available without character-level supervision.
Automatic measurements on characters, bi-grams, and spaces differentiate graphical profiles of different scribes.
Subtle variations within a single hand can be quantified and visualized across pages.
The approach scales to hundreds of pages while requiring only one column of text per page for training.
Paleographic analysis gains a frugal, scalable route from existing line transcriptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same prototype learning could be applied to other manuscript collections to create comparable metrological databases across centuries or regions.
Integration into existing handwritten text recognition pipelines might allow metrological outputs with no extra annotation step.
The measurements could support quantitative tracking of script change over time within a single scribe's output.

Load-bearing premise

Character prototypes learned solely from line-level transcriptions produce bounding boxes and derived measurements that are both accurate and paleographically relevant without character-level supervision or additional validation.

What would settle it

Comparison of the model's predicted character bounding boxes against expert annotations on held-out pages of the codex, checking whether measurement-derived distinctions between the four known hands disappear or bounding-box error exceeds a small threshold.

Figures

Figures reproduced from arXiv: 2606.09446 by Dominique Stutzmann, Malamatenia Vlachou Efstathiou, Mathieu Aubry, Rapha\"el Baena.

**Figure 2.** Figure 2: Overview of the proposed approach. Building on DTLR, our backbone predicts bounding boxes, class probabilities, and color vectors from text line images. These outputs, alongside learnable prototypes, are used to render characters and compose them spatially onto a predicted background. We leverage bounding box dimensions to accurately affine-render each character. influenced by semiotics, grammar, and char… view at source ↗

**Figure 3.** Figure 3: Visualization types. necessary, does not change our qualitative results, and is simply added to better fit human intuition. Discarding errors. While the error rate of our network after the last fine-tuning stage is very low, it is still not zero, and it is important to discard errors before computing our metrics, as detailed in the next section. To do so, we consider both the ground truth and predicted tra… view at source ↗

**Figure 4.** Figure 4: Dataset Summary, with 160 units of analysis from Paris, BnF, fr. 2813. that is invariant to these variables. Inspired by P. Saenger [36] visual unit of space, defined as the distance between minims, we use as reference unit of space half the average width of the lowercase m (wm/2). Visualisation types. We use two types of visualization for these measures, which we refer to as linear graphs and crossed grap… view at source ↗

**Figure 5.** Figure 5: Extracted prototypes for Learnable Handwriter (LHW) and Ours of the 15 most frequent letters for the first page of the dataset (f. 1r). we only consider the outer columns of each page in our analysis. We also omit rubricated lines, which may have been added at a different stage, and truncated lines, for which layout constraints can influence spacing practices. This results in a corpus of 160 units of analy… view at source ↗

**Figure 6.** Figure 6: Linear Graphs. Highlighted regions show consecutive pages. Dotted lines indicate the beginnings and endings of gatherings. We report mean quantities over units of analysis (pages). shows linear graphs for selected letters’ aspect ratios, selected bigrams’ distances, and spaces between graphical units [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

**Figure 7.** Figure 7: Crossed graphs for characters and bigrams. We cross information between the mean (µ) and coefficient of variation (CV) of different letters and bigrams aspect ratio (AR) and distances between bigrams characters (db). Letters proportionality and consistency. Figure 6a provides an overview of the evolution of letter aspect ratios across the dataset. The four Graphic Profiles (GPs) are already distinguishable… view at source ↗

read the original abstract

Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but still provide limited access to interpretable visual measurements for paleography, the study of historical scripts. In this paper, our main insight is that morphological script analysis, in particular the capacity to learn character prototypes from line-level transcriptions, enables the definition of scalable, meaningful, and stable paleographic measurements. More precisely, we leverage a transformer-based detection architecture together with a prototype-based line reconstruction module to learn prototypical characters and their occurrence, deformation, and positioning. Our contributions are twofold. First, we introduce a deep architecture and learning methodology that enables efficient character modeling with only line-level transcription supervision, significantly improving over the Learnable Typewriter baseline and enabling accurate character bounding box prediction, unlocking its potential for paleographic measurements. Second, we introduce and demonstrate the paleographical relevance of automatic measurements enabled by our architecture for characters, bi-grams, and spaces between graphical units. For this demonstration, we extend the annotations of the codex Paris, BnF, fr. 2813, commissioned in the late fourteenth century by Charles V and copied by four hands, to 160 pages. We visualize our measurements over these pages, showing how they enable us not only to differentiate graphical profiles, but also to discover and analyze subtle variations. This case study outlines the scalability of our approach and its frugality in terms of required training data, since a single column of text is sufficient to compute our measurements on each of the 160 pages. Data and code are publicly available at: https://malamatenia.github.io/morphology4metrology-analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper extends Learnable Typewriter with a transformer detector and prototype module to produce paleographic measurements from line-level labels alone, but offers no quantitative checks on box accuracy or measurement stability.

read the letter

The main takeaway is that this work turns line transcriptions into character prototypes and then into measurements of size, bigrams, and spacing on a 14th-century codex. It improves detection over the cited baseline and shows the setup can run on modest training data—one column per page across 160 pages.

What stands out is the practical framing: they extended annotations on Paris BnF fr. 2813, released code and data, and focused on frugality for real manuscript collections. The visualizations suggest the outputs can separate scribal hands and flag small variations, which aligns with paleographic needs.

The soft spot is the missing validation. The claims of accurate bounding boxes and meaningful measurements rest on qualitative figures only. There are no IoU scores, error bars, or held-out character-level comparisons, so it is unclear whether the prototypes align well enough with actual instances for the measurements to be reliable. The stress-test note on unverified alignment matches what the abstract shows.

This is for digital humanities researchers who already work with handwritten text recognition pipelines and want to add metrological outputs. A reader with similar data might test the released code directly.

It deserves peer review. The architecture is concrete, the data public, and the goal practical; referees can ask for the quantitative checks that are currently absent.

Referee Report

2 major / 1 minor

Summary. The paper claims that a transformer-based detection architecture paired with a prototype-based line reconstruction module can learn character prototypes solely from line-level transcriptions, yielding accurate bounding-box predictions that enable scalable, stable paleographic measurements (for characters, bigrams, and inter-unit spaces). It reports improvement over the Learnable Typewriter baseline and demonstrates the measurements on 160 annotated pages of Paris, BnF fr. 2813, asserting that a single text column suffices for per-page analysis.

Significance. If the accuracy and paleographic relevance claims hold, the approach would supply a low-supervision route from HTR to quantitative metrology, allowing large-scale script profiling with minimal annotation. The public release of data and code is a clear reproducibility asset.

major comments (2)

[Abstract and §3] Abstract and §3 (method): the central claim that the architecture produces 'accurate character bounding box prediction' rests on line-level supervision alone, yet no quantitative metrics (IoU, precision@0.5, or box-level error) against character-level ground truth are reported; only qualitative visualizations on one codex are described.
[§5] §5 (case study): the assertion that the derived measurements are 'meaningful and stable' and enable differentiation of graphical profiles is supported only by visualizations; no statistical tests for measurement variance across hands/pages, no expert paleographer validation, and no comparison to manual metrology are provided.

minor comments (1)

[Abstract] The abstract states 'significantly improving over the Learnable Typewriter baseline' without citing the specific table or figure that quantifies the improvement.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive report. The two major comments correctly identify that the manuscript relies primarily on qualitative evidence for its central claims. We address each point below and will revise the manuscript to qualify claims, add available quantitative elements where feasible, and clarify evaluation scope.

read point-by-point responses

Referee: [Abstract and §3] Abstract and §3 (method): the central claim that the architecture produces 'accurate character bounding box prediction' rests on line-level supervision alone, yet no quantitative metrics (IoU, precision@0.5, or box-level error) against character-level ground truth are reported; only qualitative visualizations on one codex are described.

Authors: The observation is correct: the manuscript reports no IoU, precision, or box-error metrics against character-level ground truth. The architecture is trained exclusively with line-level transcriptions, and the available annotations on the 160 pages consist of transcriptions rather than dense bounding-box ground truth. We will revise the abstract and §3 to remove the unqualified claim of 'accurate' prediction and instead describe the bounding boxes as an intermediate representation whose utility is demonstrated through the downstream metrological measurements. A short note on the absence of box-level metrics will be added. revision: yes
Referee: [§5] §5 (case study): the assertion that the derived measurements are 'meaningful and stable' and enable differentiation of graphical profiles is supported only by visualizations; no statistical tests for measurement variance across hands/pages, no expert paleographer validation, and no comparison to manual metrology are provided.

Authors: We agree that the case study currently offers only visual evidence. We will add simple statistical summaries (e.g., per-hand and per-page variance of character widths and inter-unit spaces) to §5. However, expert paleographer validation and direct comparison against manual metrology are not present in the current study and would require new annotations and collaboration outside the scope of this work. revision: partial

standing simulated objections not resolved

Expert paleographer validation of the automatic measurements and direct quantitative comparison against manual metrology on the same pages.

Circularity Check

0 steps flagged

No significant circularity; measurements are model outputs, not self-referential derivations.

full rationale

The paper presents a learned architecture (transformer detection + prototype reconstruction) trained on line-level transcriptions to produce character prototypes, bounding boxes, and derived metrological measurements. No equations, predictions, or uniqueness claims are shown that reduce these outputs to fitted inputs by construction, nor are there self-citations invoked as load-bearing premises for the central results. The measurements are explicitly described as direct outputs of the trained model applied to extended annotations on the Paris codex, with no renaming of known results or ansatz smuggling. The derivation chain is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no equations, training details, or modeling assumptions; no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.1-grok · 5843 in / 994 out tokens · 16648 ms · 2026-06-27T17:15:20.341706+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

53 extracted references · 1 canonical work pages

[1]

AIIA Notizie4, 34–38 (1999)

Aiolli, F., Simi, M., Sona, D., Sperduti, A., Starita, A., Zaccagnini, G.: Spi: A system for paleographic inspections. AIIA Notizie4, 34–38 (1999)

1999
[2]

IEEE Access9, 76674–76688 (2021)

Aradillas, J.C., Murillo-Fuentes, J.J., Olmos, P.M.: Boosting offline handwritten text recognition in historical documents with few labeled lines. IEEE Access9, 76674–76688 (2021)

2021
[3]

Advances in Neural Information Processing Systems37, 42388–42404 (2024)

Baena, R., Kalleli, S., Aubry, M.: General detection-based text line recognition. Advances in Neural Information Processing Systems37, 42388–42404 (2024)

2024
[4]

In: ACL Proceedings, vol.1

Berg-Kirkpatrick, T., Durrett, G., Klein, D.: Unsupervised transcription of historical documents. In: ACL Proceedings, vol.1. pp. 207–217 (2013)

2013
[5]

analyse sérielle de la densité de l’écriture dans les évangiles d’henri le lion

Bischoff, F.M.: Le rythme du scribe. analyse sérielle de la densité de l’écriture dans les évangiles d’henri le lion. Histoire & Mesure11(1/2), 53–91 (1996)

1996
[6]

In: ICDAR Proceedings

Bluche, T., Messina, R.: Gated convolutional recurrent neural networks for multi- lingual handwriting recognition. In: ICDAR Proceedings. pp. 646–651 (2017)

2017
[7]

In: 2016 12th IAPR DAS Workshop

Bluche, T., Stutzmann, D., Kermorvant, C.: Automatic handwritten character segmentation for paleographical character shape analysis. In: 2016 12th IAPR DAS Workshop. pp. 42–47. IEEE (2016)

2016
[8]

Digital Medievalist1(2005)

Ciula, A.: Digital palaeography: using the digital representation of medieval script to support palaeographic analysis. Digital Medievalist1(2005)

2005
[9]

Cambridge University Press (2003)

Derolez, A.: The palaeography of Gothic manuscript books: From the twelfth to the early sixteenth century. Cambridge University Press (2003)

2003
[10]

arXiv preprint arXiv:2104.07787 (2021) 16 Authors Suppressed Due to Excessive Length

Diaz, D.H., Qin, S., Ingle, R., Fujii, Y., Bissacco, A.: Rethinking text line recognition models. arXiv preprint arXiv:2104.07787 (2021) 16 Authors Suppressed Due to Excessive Length

arXiv 2021
[11]

In: Proceedings of the 2011 workshop on historical document imaging and processing

Fischer, A., Frinken, V., Fornés, A., Bunke, H.: Transcription alignment of latin manuscripts using hidden markov models. In: Proceedings of the 2011 workshop on historical document imaging and processing. pp. 29–36 (2011)

2011
[12]

ductus et rapport modulaire

Gilissen, L.: III. ductus et rapport modulaire. Scriptorium29(2), 235–244 (1975)

1975
[13]

In: Proceedings of the 23rd ICML

Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In: Proceedings of the 23rd ICML. pp. 369–376 (2006)

2006
[14]

Neural Networks18(5–6), 602–610 (2005)

Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional lstm and other neural network architectures. Neural Networks18(5–6), 602–610 (2005)

2005
[15]

Codicology and palaeography in the digital age pp

Gurrado, M.: «graphoskop», uno strumento informatico per l’analisi paleografica quantitativa. Codicology and palaeography in the digital age pp. 251–259 (2009)

2009
[16]

In: Mesure et histoire médiévale

Gurrado, M., Stutzmann, D.: Mesure et histoire des écritures médiévales. In: Mesure et histoire médiévale. XLIIIe Congrès de la SHMESP, Histoire ancienne et médiévale, vol. 123, pp. 153–166. PUPS (2013)

2013
[17]

The Art Bulletin66(1), 97–117 (1984)

Hedeman, A.D.: Valois legitimacy: Editorial changes in charles v’s grandes chroniques de france. The Art Bulletin66(1), 97–117 (1984)

1984
[18]

In: 2018 ICFHR Proceedings

Jenckel, M., Bukhari, S.S., Dengel, A.: Transcription free lstm ocr model evaluation. In: 2018 ICFHR Proceedings. pp. 122–126 (2018)

2018
[19]

Pattern Recognition129, 108766 (2022)

Kang, L., Riba, P., Rusiñol, M., Fornés, A., Villegas, M.: Pay attention to what you read: non-recurrent handwritten text-line recognition. Pattern Recognition129, 108766 (2022)

2022
[20]

In: Proceedings of the AAAI Conference on AI

Li, M., Lv, T., Chen, J., Cui, L., Lu, Y., Florencio, D., Zhang, C., Li, Z., Wei, F.: Trocr: Transformer-based optical character recognition with pre-trained models. In: Proceedings of the AAAI Conference on AI. vol. 37, pp. 13094–13102 (2023)

2023
[21]

Pattern Recognition158, 110967 (2025)

Li, Y., Chen, D., Tang, T., Shen, X.: Htr-vt: Handwritten text recognition with vision transformer. Pattern Recognition158, 110967 (2025)

2025
[22]

Heritage Science11, article 38 (2023)

Mamatsis, A.R., Mamatsi, E., Chalatsis, C., et al.: A novel methodology for writer (hand) identification: Establishing rigas feraios wrote two important greek documents discovered in romania. Heritage Science11, article 38 (2023)

2023
[23]

McGillivray, M.: Statistical analysis of digital paleographic data: what can it tell us? Digital Studies/Le champ numérique (11) (2005)

2005
[24]

In: ICDAR Proceedings

Michael, J., Labahn, R., Grüning, T., Zöllner, J.: Evaluating sequence-to-sequence models for handwritten text recognition. In: ICDAR Proceedings. pp. 1286–1293 (2019)

2019
[25]

In: Proceedings of the IEEE/CVF ICCV

Monnier, T., Vincent, E., Ponce, J., Aubry, M.: Unsupervised layered image de- composition into object prototypes. In: Proceedings of the IEEE/CVF ICCV. pp. 8640–8650 (2021)

2021
[26]

Gazette du livre médiéval56(1), 5–20 (2011)

Muzerelle, D.: À la recherche d’algorithmes experts en écritures médiévales. Gazette du livre médiéval56(1), 5–20 (2011)

2011
[27]

l’écriture entre histoire et science

Muzerelle, D., Gurrado, M.: Analyse d’images et paléographie systématique. l’écriture entre histoire et science. Gazette du livre médiéval56-57(2011)

2011
[28]

jahrhunderts und ihre schrift

Oeser, W.: Raoulet d’orléans und henri du trévou, zwei französische berufschreiber des 14. jahrhunderts und ihre schrift. Archiv für Diplomatik42, 395–418 (1996)

1996
[29]

statistique et paléographie: peut-on utiliser le rapport modulaire dans l’expertise des écritures médiévales? Scriptorium29(2), 198–234 (1975)

Ornato, E.: Ii. statistique et paléographie: peut-on utiliser le rapport modulaire dans l’expertise des écritures médiévales? Scriptorium29(2), 198–234 (1975)

1975
[30]

In: Ornato, E

Ornato, E.: L’histoire du livre et les méthodes quantitatives: bilan de vingt ans de recherche. In: Ornato, E. (ed.) La face cachée du livre médiéval. Viella, Roma (1997) Leveraging Morphology for Historical Script Metrological Analysis 17

1997
[31]

In: ICDAR Proceedings

Peng, D., Jin, L., Wu, Y., Wang, Z., Cai, M.: A fast and accurate fully convolutional network for end-to-end handwritten chinese text segmentation and recognition. In: ICDAR Proceedings. pp. 25–30 (2019)

2019
[32]

Doctoral dissertation, Università Ca’ Foscari Venezia (2014)

Peretto, M.: Raoulet d’Orléans e la copia di manoscritti a Parigi nella seconda metà del XIV secolo. Doctoral dissertation, Università Ca’ Foscari Venezia (2014)

2014
[33]

PloS one16(4), e0249769 (2021)

Popović, M., Dhali, M.A., Schomaker, L.: Artificial intelligence based writer iden- tification generates new evidence for the unknown scribes of the dead sea scrolls exemplified by the great isaiah scroll (1qisaa). PloS one16(4), e0249769 (2021)

2021
[34]

Bibliothèque de l’École des chartes132(1), 101–110 (1974)

Poulle, E.: Paléographie et méthodologie: vers l’analyse scientifique des écritures médiévales. Bibliothèque de l’École des chartes132(1), 101–110 (1974)

1974
[35]

Puigcerver, J.: Are multidimensional recurrent layers really necessary for handwrit- ten text recognition? In: ICDAR Proceedings. pp. 67–72 (2017)

2017
[36]

Stanford University Press (1997)

Saenger, P.: Space between words: The origins of silent reading. Stanford University Press (1997)

1997
[37]

Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequencerecognitionanditsapplicationtoscenetextrecognition.IEEETransactions on Pattern Analysis and Machine Intelligence39(11), 2298–2304 (2016)

2016
[38]

In: ICDAR Proceedings

Siglidis, I., Gonthier, N., Gaubil, J., Monnier, T., Aubry, M.: The learnable type- writer: A generative approach to text analysis. In: ICDAR Proceedings. pp. 297—-
[39]

LNCC, Springer (2024)

2024
[40]

In: ICDAR Proceedings

Sinha, A., Jenckel, M., Bukhari, S.S., Dengel, A.: Unsupervised ocr model evaluation using gan. In: ICDAR Proceedings. pp. 1256–1261 (2019)

2019
[41]

Advances in Neural Information Processing Systems34, 5494–5505 (2021)

Smirnov, D., Gharbi, M., Fisher, M., Guizilini, V., Efros, A., Solomon, J.M.: Mari- onette: Self-supervised sprite learning. Advances in Neural Information Processing Systems34, 5494–5505 (2021)

2021
[42]

Les mutations du XIIe siècle et la datation des écritures par le profil scribal collectif

Stutzmann, D.: Conjuguer diplomatique, paléographie et édition électronique. Les mutations du XIIe siècle et la datation des écritures par le profil scribal collectif. In: Digital diplomatics. The computer as a tool for the diplomatist, pp. 271–290. No. 14 in Archiv für Diplomatik. Beiheft, Böhlau, Köln (2014)

2014
[43]

In: Words in the Middle Ages/Les Mots au Moyen Âge, pp

Stutzmann, D.: Words as graphic and linguistic structures: Word spacing in psalm 101 domine exaudi orationem meam (eleventh-fifteenth centuries). In: Words in the Middle Ages/Les Mots au Moyen Âge, pp. 21–59 (2020)

2020
[44]

In: DH2015 Global Digital Humanities (2015)

Stutzmann, D., Bluche, T., Lavrentiev, A., Leydier, Y., Kermorvant, C.: From text and image to historical resource: Text-image alignment for digital humanists. In: DH2015 Global Digital Humanities (2015)

2015
[45]

2813 - grandes chroniques de france (Apr 2025), https://doi.org/10.5281/zenodo.15282371

Vlachou Efstathiou, M.: Dataset for bnf, fr. 2813 - grandes chroniques de france (Apr 2025), https://doi.org/10.5281/zenodo.15282371

work page doi:10.5281/zenodo.15282371 2025
[46]

Scriptorium (forthcoming in 2026), https://malamatenia

Vlachou-Efstathiou, M.: Interpretable deep learning for palaeographic variability analysis; revisiting the scribal hands of Charles V’ grandes Chroniques de France (paris, bnf, fr., 2813). Scriptorium (forthcoming in 2026), https://malamatenia. github.io/bnf-fr-2813/

2026
[47]

In: ICDAR Proceed- ings

Vlachou-Efstathiou, M., Siglidis, I., Stutzmann, D., Aubry, M.: An interpretable deep learning approach for morphological script type analysis. In: ICDAR Proceed- ings. pp. 3–21. Springer (2024)

2024
[48]

In: 2016 ICFHR Proceedings

Voigtlaender, P., Doetsch, P., Ney, H.: Handwriting recognition with large multi- dimensional long short-term memory recurrent neural networks. In: 2016 ICFHR Proceedings. pp. 228–233 (2016)

2016
[49]

IEEE Transactions on Pattern Analysis and Machine Intelligence21(12), 1280–1296 (1999)

Xu, Y., Nagy, G.: Prototype extraction and adaptive ocr. IEEE Transactions on Pattern Analysis and Machine Intelligence21(12), 1280–1296 (1999)

1999
[50]

In: ICDAR Proceedings

Yin, F., Wang, Q.F., Zhang, X.Y., Liu, C.L.: Icdar 2013 chinese handwriting recognition competition. In: ICDAR Proceedings. pp. 1464–1470 (2013) 18 Authors Suppressed Due to Excessive Length

2013
[51]

Pattern Recognition p

Yu, M.M., Zhang, H., Yin, F., Liu, C.L.: An approach for handwritten chinese text recognition unifying character segmentation and recognition. Pattern Recognition p. 110373 (2024)

2024
[52]

Scrittura e civiltà 12, 135–176 (1988)

Zamponi, S.: Elisione e sovrapposizione nella littera textualis. Scrittura e civiltà 12, 135–176 (1988)

1988
[53]

arxiv preprint arXiv:2203.03605 (2022)

Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., Shum, H.Y.: Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arxiv preprint arXiv:2203.03605 (2022)

Pith/arXiv arXiv 2022

[1] [1]

AIIA Notizie4, 34–38 (1999)

Aiolli, F., Simi, M., Sona, D., Sperduti, A., Starita, A., Zaccagnini, G.: Spi: A system for paleographic inspections. AIIA Notizie4, 34–38 (1999)

1999

[2] [2]

IEEE Access9, 76674–76688 (2021)

Aradillas, J.C., Murillo-Fuentes, J.J., Olmos, P.M.: Boosting offline handwritten text recognition in historical documents with few labeled lines. IEEE Access9, 76674–76688 (2021)

2021

[3] [3]

Advances in Neural Information Processing Systems37, 42388–42404 (2024)

Baena, R., Kalleli, S., Aubry, M.: General detection-based text line recognition. Advances in Neural Information Processing Systems37, 42388–42404 (2024)

2024

[4] [4]

In: ACL Proceedings, vol.1

Berg-Kirkpatrick, T., Durrett, G., Klein, D.: Unsupervised transcription of historical documents. In: ACL Proceedings, vol.1. pp. 207–217 (2013)

2013

[5] [5]

analyse sérielle de la densité de l’écriture dans les évangiles d’henri le lion

Bischoff, F.M.: Le rythme du scribe. analyse sérielle de la densité de l’écriture dans les évangiles d’henri le lion. Histoire & Mesure11(1/2), 53–91 (1996)

1996

[6] [6]

In: ICDAR Proceedings

Bluche, T., Messina, R.: Gated convolutional recurrent neural networks for multi- lingual handwriting recognition. In: ICDAR Proceedings. pp. 646–651 (2017)

2017

[7] [7]

In: 2016 12th IAPR DAS Workshop

Bluche, T., Stutzmann, D., Kermorvant, C.: Automatic handwritten character segmentation for paleographical character shape analysis. In: 2016 12th IAPR DAS Workshop. pp. 42–47. IEEE (2016)

2016

[8] [8]

Digital Medievalist1(2005)

Ciula, A.: Digital palaeography: using the digital representation of medieval script to support palaeographic analysis. Digital Medievalist1(2005)

2005

[9] [9]

Cambridge University Press (2003)

Derolez, A.: The palaeography of Gothic manuscript books: From the twelfth to the early sixteenth century. Cambridge University Press (2003)

2003

[10] [10]

arXiv preprint arXiv:2104.07787 (2021) 16 Authors Suppressed Due to Excessive Length

Diaz, D.H., Qin, S., Ingle, R., Fujii, Y., Bissacco, A.: Rethinking text line recognition models. arXiv preprint arXiv:2104.07787 (2021) 16 Authors Suppressed Due to Excessive Length

arXiv 2021

[11] [11]

In: Proceedings of the 2011 workshop on historical document imaging and processing

Fischer, A., Frinken, V., Fornés, A., Bunke, H.: Transcription alignment of latin manuscripts using hidden markov models. In: Proceedings of the 2011 workshop on historical document imaging and processing. pp. 29–36 (2011)

2011

[12] [12]

ductus et rapport modulaire

Gilissen, L.: III. ductus et rapport modulaire. Scriptorium29(2), 235–244 (1975)

1975

[13] [13]

In: Proceedings of the 23rd ICML

Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In: Proceedings of the 23rd ICML. pp. 369–376 (2006)

2006

[14] [14]

Neural Networks18(5–6), 602–610 (2005)

Graves, A., Schmidhuber, J.: Framewise phoneme classification with bidirectional lstm and other neural network architectures. Neural Networks18(5–6), 602–610 (2005)

2005

[15] [15]

Codicology and palaeography in the digital age pp

Gurrado, M.: «graphoskop», uno strumento informatico per l’analisi paleografica quantitativa. Codicology and palaeography in the digital age pp. 251–259 (2009)

2009

[16] [16]

In: Mesure et histoire médiévale

Gurrado, M., Stutzmann, D.: Mesure et histoire des écritures médiévales. In: Mesure et histoire médiévale. XLIIIe Congrès de la SHMESP, Histoire ancienne et médiévale, vol. 123, pp. 153–166. PUPS (2013)

2013

[17] [17]

The Art Bulletin66(1), 97–117 (1984)

Hedeman, A.D.: Valois legitimacy: Editorial changes in charles v’s grandes chroniques de france. The Art Bulletin66(1), 97–117 (1984)

1984

[18] [18]

In: 2018 ICFHR Proceedings

Jenckel, M., Bukhari, S.S., Dengel, A.: Transcription free lstm ocr model evaluation. In: 2018 ICFHR Proceedings. pp. 122–126 (2018)

2018

[19] [19]

Pattern Recognition129, 108766 (2022)

Kang, L., Riba, P., Rusiñol, M., Fornés, A., Villegas, M.: Pay attention to what you read: non-recurrent handwritten text-line recognition. Pattern Recognition129, 108766 (2022)

2022

[20] [20]

In: Proceedings of the AAAI Conference on AI

Li, M., Lv, T., Chen, J., Cui, L., Lu, Y., Florencio, D., Zhang, C., Li, Z., Wei, F.: Trocr: Transformer-based optical character recognition with pre-trained models. In: Proceedings of the AAAI Conference on AI. vol. 37, pp. 13094–13102 (2023)

2023

[21] [21]

Pattern Recognition158, 110967 (2025)

Li, Y., Chen, D., Tang, T., Shen, X.: Htr-vt: Handwritten text recognition with vision transformer. Pattern Recognition158, 110967 (2025)

2025

[22] [22]

Heritage Science11, article 38 (2023)

Mamatsis, A.R., Mamatsi, E., Chalatsis, C., et al.: A novel methodology for writer (hand) identification: Establishing rigas feraios wrote two important greek documents discovered in romania. Heritage Science11, article 38 (2023)

2023

[23] [23]

McGillivray, M.: Statistical analysis of digital paleographic data: what can it tell us? Digital Studies/Le champ numérique (11) (2005)

2005

[24] [24]

In: ICDAR Proceedings

Michael, J., Labahn, R., Grüning, T., Zöllner, J.: Evaluating sequence-to-sequence models for handwritten text recognition. In: ICDAR Proceedings. pp. 1286–1293 (2019)

2019

[25] [25]

In: Proceedings of the IEEE/CVF ICCV

Monnier, T., Vincent, E., Ponce, J., Aubry, M.: Unsupervised layered image de- composition into object prototypes. In: Proceedings of the IEEE/CVF ICCV. pp. 8640–8650 (2021)

2021

[26] [26]

Gazette du livre médiéval56(1), 5–20 (2011)

Muzerelle, D.: À la recherche d’algorithmes experts en écritures médiévales. Gazette du livre médiéval56(1), 5–20 (2011)

2011

[27] [27]

l’écriture entre histoire et science

Muzerelle, D., Gurrado, M.: Analyse d’images et paléographie systématique. l’écriture entre histoire et science. Gazette du livre médiéval56-57(2011)

2011

[28] [28]

jahrhunderts und ihre schrift

Oeser, W.: Raoulet d’orléans und henri du trévou, zwei französische berufschreiber des 14. jahrhunderts und ihre schrift. Archiv für Diplomatik42, 395–418 (1996)

1996

[29] [29]

statistique et paléographie: peut-on utiliser le rapport modulaire dans l’expertise des écritures médiévales? Scriptorium29(2), 198–234 (1975)

Ornato, E.: Ii. statistique et paléographie: peut-on utiliser le rapport modulaire dans l’expertise des écritures médiévales? Scriptorium29(2), 198–234 (1975)

1975

[30] [30]

In: Ornato, E

Ornato, E.: L’histoire du livre et les méthodes quantitatives: bilan de vingt ans de recherche. In: Ornato, E. (ed.) La face cachée du livre médiéval. Viella, Roma (1997) Leveraging Morphology for Historical Script Metrological Analysis 17

1997

[31] [31]

In: ICDAR Proceedings

Peng, D., Jin, L., Wu, Y., Wang, Z., Cai, M.: A fast and accurate fully convolutional network for end-to-end handwritten chinese text segmentation and recognition. In: ICDAR Proceedings. pp. 25–30 (2019)

2019

[32] [32]

Doctoral dissertation, Università Ca’ Foscari Venezia (2014)

Peretto, M.: Raoulet d’Orléans e la copia di manoscritti a Parigi nella seconda metà del XIV secolo. Doctoral dissertation, Università Ca’ Foscari Venezia (2014)

2014

[33] [33]

PloS one16(4), e0249769 (2021)

Popović, M., Dhali, M.A., Schomaker, L.: Artificial intelligence based writer iden- tification generates new evidence for the unknown scribes of the dead sea scrolls exemplified by the great isaiah scroll (1qisaa). PloS one16(4), e0249769 (2021)

2021

[34] [34]

Bibliothèque de l’École des chartes132(1), 101–110 (1974)

Poulle, E.: Paléographie et méthodologie: vers l’analyse scientifique des écritures médiévales. Bibliothèque de l’École des chartes132(1), 101–110 (1974)

1974

[35] [35]

Puigcerver, J.: Are multidimensional recurrent layers really necessary for handwrit- ten text recognition? In: ICDAR Proceedings. pp. 67–72 (2017)

2017

[36] [36]

Stanford University Press (1997)

Saenger, P.: Space between words: The origins of silent reading. Stanford University Press (1997)

1997

[37] [37]

Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequencerecognitionanditsapplicationtoscenetextrecognition.IEEETransactions on Pattern Analysis and Machine Intelligence39(11), 2298–2304 (2016)

2016

[38] [38]

In: ICDAR Proceedings

Siglidis, I., Gonthier, N., Gaubil, J., Monnier, T., Aubry, M.: The learnable type- writer: A generative approach to text analysis. In: ICDAR Proceedings. pp. 297—-

[39] [39]

LNCC, Springer (2024)

2024

[40] [40]

In: ICDAR Proceedings

Sinha, A., Jenckel, M., Bukhari, S.S., Dengel, A.: Unsupervised ocr model evaluation using gan. In: ICDAR Proceedings. pp. 1256–1261 (2019)

2019

[41] [41]

Advances in Neural Information Processing Systems34, 5494–5505 (2021)

Smirnov, D., Gharbi, M., Fisher, M., Guizilini, V., Efros, A., Solomon, J.M.: Mari- onette: Self-supervised sprite learning. Advances in Neural Information Processing Systems34, 5494–5505 (2021)

2021

[42] [42]

Les mutations du XIIe siècle et la datation des écritures par le profil scribal collectif

Stutzmann, D.: Conjuguer diplomatique, paléographie et édition électronique. Les mutations du XIIe siècle et la datation des écritures par le profil scribal collectif. In: Digital diplomatics. The computer as a tool for the diplomatist, pp. 271–290. No. 14 in Archiv für Diplomatik. Beiheft, Böhlau, Köln (2014)

2014

[43] [43]

In: Words in the Middle Ages/Les Mots au Moyen Âge, pp

Stutzmann, D.: Words as graphic and linguistic structures: Word spacing in psalm 101 domine exaudi orationem meam (eleventh-fifteenth centuries). In: Words in the Middle Ages/Les Mots au Moyen Âge, pp. 21–59 (2020)

2020

[44] [44]

In: DH2015 Global Digital Humanities (2015)

Stutzmann, D., Bluche, T., Lavrentiev, A., Leydier, Y., Kermorvant, C.: From text and image to historical resource: Text-image alignment for digital humanists. In: DH2015 Global Digital Humanities (2015)

2015

[45] [45]

2813 - grandes chroniques de france (Apr 2025), https://doi.org/10.5281/zenodo.15282371

Vlachou Efstathiou, M.: Dataset for bnf, fr. 2813 - grandes chroniques de france (Apr 2025), https://doi.org/10.5281/zenodo.15282371

work page doi:10.5281/zenodo.15282371 2025

[46] [46]

Scriptorium (forthcoming in 2026), https://malamatenia

Vlachou-Efstathiou, M.: Interpretable deep learning for palaeographic variability analysis; revisiting the scribal hands of Charles V’ grandes Chroniques de France (paris, bnf, fr., 2813). Scriptorium (forthcoming in 2026), https://malamatenia. github.io/bnf-fr-2813/

2026

[47] [47]

In: ICDAR Proceed- ings

Vlachou-Efstathiou, M., Siglidis, I., Stutzmann, D., Aubry, M.: An interpretable deep learning approach for morphological script type analysis. In: ICDAR Proceed- ings. pp. 3–21. Springer (2024)

2024

[48] [48]

In: 2016 ICFHR Proceedings

Voigtlaender, P., Doetsch, P., Ney, H.: Handwriting recognition with large multi- dimensional long short-term memory recurrent neural networks. In: 2016 ICFHR Proceedings. pp. 228–233 (2016)

2016

[49] [49]

IEEE Transactions on Pattern Analysis and Machine Intelligence21(12), 1280–1296 (1999)

Xu, Y., Nagy, G.: Prototype extraction and adaptive ocr. IEEE Transactions on Pattern Analysis and Machine Intelligence21(12), 1280–1296 (1999)

1999

[50] [50]

In: ICDAR Proceedings

Yin, F., Wang, Q.F., Zhang, X.Y., Liu, C.L.: Icdar 2013 chinese handwriting recognition competition. In: ICDAR Proceedings. pp. 1464–1470 (2013) 18 Authors Suppressed Due to Excessive Length

2013

[51] [51]

Pattern Recognition p

Yu, M.M., Zhang, H., Yin, F., Liu, C.L.: An approach for handwritten chinese text recognition unifying character segmentation and recognition. Pattern Recognition p. 110373 (2024)

2024

[52] [52]

Scrittura e civiltà 12, 135–176 (1988)

Zamponi, S.: Elisione e sovrapposizione nella littera textualis. Scrittura e civiltà 12, 135–176 (1988)

1988

[53] [53]

arxiv preprint arXiv:2203.03605 (2022)

Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., Shum, H.Y.: Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arxiv preprint arXiv:2203.03605 (2022)

Pith/arXiv arXiv 2022