Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke
Pith reviewed 2026-05-13 07:44 UTC · model grok-4.3
The pith
Semantic search finds substantially more implicit receptions of Locke's ideas than lexical matching in 18th-century texts, though surface vocabulary still shapes results.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using expert annotation grounded in a semantic taxonomy, the authors examine whether an off-the-shelf semantic search pipeline can surface meaning-level correspondences overlooked by lexical methods. Results show semantic search retrieves substantially more implicit receptions than lexical baselines. Linguistic diagnostics reveal a lexical gatekeeping effect in which retrieval remains partially constrained by surface vocabulary overlap.
What carries the argument
An off-the-shelf semantic search pipeline, evaluated against expert annotations grounded in a semantic taxonomy of implicit receptions of Locke.
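A minimal sketch of the ranking step that a dense retrieval pipeline of this kind performs, assuming passages have already been embedded as vectors. The paper passage quoted further down this page names paraphrase-multilingual-mpnet-base-v2 and FAISS; neither is used here. The three-dimensional vectors, the passage ids, and the helper names cosine and top_k are hypothetical, and brute-force cosine ranking stands in for a vector index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, corpus_vecs, k=2):
    """Return (passage_id, score) pairs for the k most similar passages."""
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in corpus_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real sentence embeddings have hundreds of dimensions.
corpus = {
    "passage_a": [0.9, 0.1, 0.0],
    "passage_b": [0.1, 0.9, 0.0],
    "passage_c": [0.7, 0.3, 0.1],
}
locke_query = [1.0, 0.0, 0.0]  # stands in for an embedded passage from Locke

hits = top_k(locke_query, corpus, k=2)
for passage_id, score in hits:
    print(passage_id, round(score, 3))
```

In this toy setup the ranking depends only on vector direction, so a candidate that shares no words with the query can still rank highly if its embedding is close.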
If this is right
- Historians gain access to paraphrased and complex implicit engagements with ideas that verbatim detection misses.
- Large-scale tracing of idea circulation becomes feasible beyond direct quotations.
- Retrieval performance stays influenced by lexical overlap, limiting full independence from surface forms.
- Combining semantic and lexical approaches can improve coverage of intellectual transmission.
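One way to probe the influence of lexical overlap noted in the list above is to measure how much surface vocabulary each retrieved passage shares with its query. The sketch below uses Jaccard overlap on lowercased word sets. This measure, the helper names, and the example sentences are assumptions chosen for illustration, not the diagnostics the paper reports.

```python
import re


def tokens(text):
    """Lowercased word tokens as a set; a crude stand-in for real preprocessing."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Invented example sentences, not quotations from any source text.
source = "the mind is a blank sheet of paper"
near_verbatim = "the mind is a blank sheet"
paraphrase = "our understanding begins empty and is furnished by experience"

overlap_verbatim = jaccard(source, near_verbatim)
overlap_paraphrase = jaccard(source, paraphrase)
print(overlap_verbatim, overlap_paraphrase)
```

If expert-confirmed implicit receptions that the system retrieves show consistently higher overlap than those it misses, that pattern would be consistent with the gatekeeping effect described here.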
Where Pith is reading between the lines
- Training embeddings on period-specific corpora could weaken the lexical gatekeeping effect observed here.
- The same evaluation setup could be applied to receptions of other key authors or in adjacent centuries.
- Historians might test hybrid retrieval pipelines that weight semantic and lexical signals differently for different research questions.
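The hybrid idea in the last item can be sketched as a weighted sum of a semantic score and a lexical score. The linear form, the weight alpha, the candidate names, and all score values below are assumptions for illustration; the paper does not describe such a pipeline.

```python
def hybrid_score(semantic, lexical, alpha=0.5):
    """Linear blend of two scores in [0, 1]; alpha is the weight on the semantic signal."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * lexical


def rank(candidates, alpha):
    """Order candidate ids by blended score, best first."""
    return sorted(
        candidates,
        key=lambda pid: hybrid_score(*candidates[pid], alpha=alpha),
        reverse=True,
    )


# Hypothetical (semantic, lexical) score pairs for two candidate passages.
candidates = {
    "near_quotation": (0.60, 0.90),
    "paraphrase": (0.85, 0.10),
}

semantic_heavy = rank(candidates, alpha=0.9)
lexical_heavy = rank(candidates, alpha=0.1)
print(semantic_heavy, lexical_heavy)
```

A question about verbatim reuse might favour a low alpha, while a search for implicit engagement might favour a high one; the right setting would need to be checked against annotated data.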
Load-bearing premise
Semantic embeddings trained mainly on modern text can reliably detect 18th-century meaning-level matches when measured against expert semantic annotations.
What would settle it
Expert re-annotation of a larger sample showing no meaningful increase in implicit receptions recovered by semantic search compared with lexical search would undermine the reported advantage.
Original abstract
While digitized corpora have transformed the study of intellectual transmission, current methods rely heavily on lexical text reuse detection, capturing verbatim quotations but fundamentally missing paraphrases and complex implicit engagement. This paper evaluates semantic search in 18th-century intellectual history through the reception of John Locke's foundational work. Using expert annotation grounded in a semantic taxonomy, we examine whether an off-the-shelf semantic search pipeline can surface meaning-level correspondences overlooked by lexical methods. Our results demonstrate that semantic search retrieves substantially more implicit receptions than lexical baselines. However, linguistic diagnostics also reveal a "lexical gatekeeping" effect, where retrieval remains partially constrained by surface vocabulary overlap. These findings highlight both the potential and the limitations of semantic retrieval for analyzing the circulation of ideas in large historical corpora. The data is available at https://github.com/COMHIS/locke-sim-data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates an off-the-shelf semantic search pipeline against lexical baselines for detecting implicit (non-verbatim) receptions of John Locke's works in 18th-century texts. Using expert annotations grounded in a semantic taxonomy, it claims that semantic search surfaces substantially more implicit receptions than lexical methods, while also documenting a 'lexical gatekeeping' effect in which retrieval remains partially constrained by surface-form vocabulary overlap. The data and annotations are released publicly.
Significance. If the quantitative results and embedding fidelity hold after the requested clarifications, the work would be significant for digital intellectual history and computational humanities. It supplies a concrete, reproducible case study that quantifies the gap between lexical reuse detection and meaning-level retrieval, while isolating a diagnostic limitation ('lexical gatekeeping') that future methods must address. The public release of the annotated dataset further strengthens its utility for the community.
major comments (2)
- [Abstract] Abstract: the headline claim that semantic search 'retrieves substantially more implicit receptions' is presented without any reported sample sizes, inter-annotator agreement figures, or statistical tests, rendering the magnitude and reliability of the improvement unverifiable from the given text.
- [Evaluation protocol] Evaluation protocol (implicit in the abstract and methods description): the central claim that off-the-shelf modern embeddings capture 18th-century meaning-level correspondences rests on expert annotations, yet the manuscript provides no direct validation of embedding fidelity to period usage (e.g., sense disambiguation accuracy on 18th-century vocabulary or comparison against embeddings trained on contemporary corpora).
minor comments (2)
- [Methods] The semantic taxonomy used for annotation is referenced but not described in sufficient detail (number of categories, inter-category distinctions, or examples of implicit vs. explicit reception).
- [Data availability] The GitHub repository link is given, but the manuscript should include a brief data statement summarizing the number of annotated pairs, annotation guidelines, and any preprocessing steps applied to the historical corpus.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed report. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract: the headline claim that semantic search 'retrieves substantially more implicit receptions' is presented without any reported sample sizes, inter-annotator agreement figures, or statistical tests, rendering the magnitude and reliability of the improvement unverifiable from the given text.
  Authors: We agree that the abstract should report these quantitative details to make the central claim immediately verifiable. In the revised version we will expand the abstract to state the total number of expert-annotated passages, the inter-annotator agreement (Cohen’s kappa), and the statistical test used to compare semantic versus lexical retrieval rates. These figures are already present in the main text and will now appear in the abstract as well.
  Revision: yes
- Referee: [Evaluation protocol] Evaluation protocol (implicit in the abstract and methods description): the central claim that off-the-shelf modern embeddings capture 18th-century meaning-level correspondences rests on expert annotations, yet the manuscript provides no direct validation of embedding fidelity to period usage (e.g., sense disambiguation accuracy on 18th-century vocabulary or comparison against embeddings trained on contemporary corpora).
  Authors: The expert annotations, performed by specialists in 18th-century intellectual history and guided by an explicit semantic taxonomy, constitute our primary empirical validation that retrieved passages reflect meaning-level engagement with Locke. We did not, however, conduct separate sense-disambiguation accuracy tests on period vocabulary or train and compare against 18th-century-specific embeddings. In the revision we will add a new subsection in Methods that (a) justifies the use of off-the-shelf embeddings for reproducibility and (b) explicitly acknowledges the absence of these additional fidelity metrics as a limitation, while outlining how future work could address it. This clarification will be added without new experiments.
  Revision: partial
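For readers unfamiliar with the agreement statistic named in the first response, here is a minimal sketch of Cohen's kappa for two annotators. The label names follow the taxonomy categories quoted elsewhere on this page (lexical, paraphrase, meaning, topical); the six annotations are invented toy data and say nothing about the paper's actual agreement figures.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled at random with their own label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented toy annotations for six retrieved passages.
annotator_1 = ["lexical", "paraphrase", "meaning", "topical", "meaning", "paraphrase"]
annotator_2 = ["lexical", "paraphrase", "meaning", "topical", "topical", "paraphrase"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on five of six items (observed agreement 5/6) against an expected chance agreement of 0.25, giving a kappa of 7/9.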
Circularity Check
No significant circularity; evaluation uses external expert annotations and standard lexical baselines
The paper evaluates an off-the-shelf semantic search pipeline against expert annotations grounded in a semantic taxonomy and compares results directly to lexical baselines. No parameters are fitted to the evaluation data and then presented as predictions. No self-citations or prior author work are invoked as load-bearing uniqueness theorems or ansatzes. The reported 'lexical gatekeeping' effect is diagnosed via linguistic diagnostics on the retrieved outputs rather than assumed by construction. The derivation chain remains self-contained against external benchmarks with no reduction of claimed results to inputs by definition or self-reference.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Semantic embeddings capture meaning-level correspondences beyond lexical overlap in 18th-century English.
- Domain assumption: Expert annotations grounded in the semantic taxonomy provide valid ground truth.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "We adopt a deliberately minimal dense retrieval-based search pipeline... encoded using the paraphrase-multilingual-mpnet-base-v2 model... Efficient vector indexing and retrieval were executed via FAISS"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem absolute_floor_iff_bare_distinguishability
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "heuristic annotation taxonomy... Lexical Matches, Paraphrase Matches, Meaning Matches, Topical Matches"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.