Overview of HIPE-2026: Person-Place Relation Extraction from Multilingual Historical Texts
Pith reviewed 2026-06-25 20:25 UTC · model grok-4.3
The pith
HIPE-2026 shows that systems can extract temporally grounded person-place relations from noisy historical texts, but must trade accuracy against efficiency and cross-domain robustness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The HIPE-2026 campaign confronted participants with two temporally grounded relations: at (presence at some point before the document's publication date) and isAt (presence contemporaneous with that date). It found that both large language models and lightweight task-specific classifiers can be applied, but that they exhibit clear trade-offs among accuracy, speed, and robustness when processing multilingual historical material at scale.
What carries the argument
The three-fold evaluation framework that scores systems on accuracy, efficiency, and generalization to a surprise literary domain while distinguishing prior presence (at) from contemporaneous presence (isAt).
If this is right
- Lightweight classifiers remain competitive when processing millions of pages where speed matters more than marginal accuracy gains.
- Cross-domain testing on literary texts reveals whether newspaper-trained models transfer to other historical genres.
- Explicit handling of OCR noise and historical spelling variation is required for any system intended for real cultural-heritage corpora.
- The two-relation temporal distinction forces models to reason about document publication dates rather than treating all mentions as static facts.
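The temporal distinction in the last point can be made concrete with a toy sketch. It assumes, purely for illustration, that a single presence date is known for a person-place pair and that "contemporaneous" means within a fixed window around the publication date. The names `PersonPlaceLabels`, `label_pair`, and `window_days` are hypothetical and do not come from the task guidelines, which may use a different label inventory.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PersonPlaceLabels:
    at: bool      # person was at the place at some point before publication
    is_at: bool   # person is at the place around the publication date


def label_pair(presence: Optional[date], published: date,
               window_days: int = 7) -> PersonPlaceLabels:
    """Toy labelling rule; window_days is an illustrative assumption."""
    if presence is None:
        # No evidence of presence: neither relation holds.
        return PersonPlaceLabels(at=False, is_at=False)
    at = presence < published
    is_at = abs((published - presence).days) <= window_days
    return PersonPlaceLabels(at=at, is_at=is_at)


published = date(1890, 5, 1)
old_visit = label_pair(date(1850, 1, 1), published)
recent_visit = label_pair(date(1890, 4, 30), published)
```

In this sketch the same mention yields different labels depending only on the publication date, which is why a model cannot treat the relation as a static fact about the person.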
Where Pith is reading between the lines
- Combining the relation extraction step with the named-entity linking from earlier HIPE editions could produce end-to-end timelines of individual movements.
- Adding more languages or additional relation types would test whether the observed trade-offs persist beyond the current setup.
- Efficiency metrics could be used to rank systems for deployment on very large archives where annotation cost is the practical bottleneck.
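One possible way to use efficiency metrics for ranking, sketched under stated assumptions, is to keep only the runs that are not dominated on both accuracy and runtime (a Pareto front). The run names and numbers below are invented for illustration and are not campaign results; `Run` and `pareto_front` are hypothetical helpers, and the campaign's own efficiency measure may be defined differently.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    name: str
    accuracy: float  # higher is better
    seconds: float   # lower is better


def pareto_front(runs: List[Run]) -> List[Run]:
    """Return runs that no other run beats on both accuracy and runtime."""
    front = []
    for r in runs:
        dominated = any(
            o.accuracy >= r.accuracy and o.seconds <= r.seconds
            and (o.accuracy > r.accuracy or o.seconds < r.seconds)
            for o in runs
        )
        if not dominated:
            front.append(r)
    # Order the surviving runs from fastest to slowest.
    return sorted(front, key=lambda r: r.seconds)


# Invented example values, not taken from the paper.
runs = [
    Run("llm-large", 0.80, 3600.0),
    Run("distilled", 0.74, 300.0),
    Run("linear", 0.65, 20.0),
    Run("slow-weak", 0.60, 900.0),
]
front_names = [r.name for r in pareto_front(runs)]
```

A front like this avoids collapsing accuracy and cost into a single score: an archive operator can pick the point on the front that matches the available compute budget.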
Load-bearing premise
The three languages, two relation types, and two document domains are enough to test robustness and generalization for historical relation extraction.
What would settle it
If every high-performing system used the same underlying approach and showed no measurable difference in efficiency or domain-shift performance, the claimed diversity of viable strategies and inherent trade-offs would not hold.
Original abstract
Was this person ever at that place, and if so, when? Answering such questions from noisy, multilingual historical documents is the central challenge of HIPE-2026, the third edition of the HIPE evaluation series. Moving from named entity recognition and linking (HIPE-2020, HIPE-2022) to reasoning about relationships between entities, HIPE-2026 targets two temporally grounded relation types: $at$, indicating that a person was present at a location at some point prior to a document's publication date, and $isAt$, indicating presence contemporaneous with that date. This paper presents the results of the evaluation campaign, which confronted 17 participating teams with the challenges of historical language variation, OCR noise, and indirect contextual cues across three languages: French, German, and English. The datasets include historical newspaper text from the nineteenth and twentieth centuries, as well as a surprise-domain generalization set drawn from early modern French literary texts. A distinctive feature of HIPE-2026 is its three-fold evaluation framework, which assesses predictive accuracy, computational efficiency, and cross-domain generalization, reflecting the practical demands of large-scale historical document processing in the cultural heritage domain. Across more than 40 submitted runs, results reveal a wide range of strategies, from state-of-the-art large language models to lightweight task-specific classifiers, and highlight the trade-offs between accuracy, efficiency, and robustness inherent to historical relation extraction at corpus scale. System descriptions, datasets, and findings are presented and discussed, offering a detailed picture of the current state of temporally grounded relation extraction for historical documents.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents an overview of the HIPE-2026 shared task on extracting two temporally grounded person-place relations (at and isAt) from noisy multilingual historical texts in French, German, and English. It describes the task setup, datasets (19th-20th century newspapers plus a surprise early-modern French literary domain), participation by 17 teams with over 40 runs, and a three-fold evaluation framework covering predictive accuracy, computational efficiency, and cross-domain generalization. Results are summarized to illustrate a spectrum of approaches from state-of-the-art LLMs to lightweight classifiers along with observed trade-offs in accuracy, efficiency, and robustness.
Significance. If the reported outcomes hold, the paper is significant for documenting current capabilities and practical trade-offs in historical relation extraction, a task central to large-scale cultural-heritage document processing. The three-fold evaluation framework is a clear strength, as it directly addresses real-world constraints of accuracy, efficiency, and generalization rather than accuracy alone.
minor comments (2)
- [Abstract] The notation $at$ and $isAt$ is introduced without a worked example; a brief inline gloss or example would improve immediate readability for readers outside the shared-task community.
- The manuscript would benefit from a consolidated table (perhaps in the results section) that reports the top systems on all three evaluation dimensions side-by-side, rather than scattering the metrics across separate discussions.
Simulated Author's Rebuttal
We thank the referee for their positive summary, recognition of the significance of the three-fold evaluation framework, and recommendation for minor revision. No major comments were provided in the report.
Circularity Check
No significant circularity; factual report on external evaluation campaign
The paper is an overview of the HIPE-2026 shared task results. It describes the task setup, datasets (newspapers and literary texts in French/German/English), evaluation framework (accuracy, efficiency, generalization), and summarizes participant outcomes across >40 runs. No equations, derivations, fitted parameters, or predictions are present. Claims about strategies and trade-offs follow directly from running the described campaign; no self-definitional reductions, load-bearing self-citations, or ansatzes exist. This is a standard descriptive shared-task report with no derivation chain to inspect.