Pith · machine review for the scientific record

arxiv: 2604.06232 · v1 · submitted 2026-04-02 · 💻 cs.DL · cs.IR

Recognition: 1 Lean theorem link

What Do Humanities Scholars Need? A User Model for Recommendation in Digital Archives

Dominik Kowald, Florian Atzenhofer-Baumgartner

Pith reviewed 2026-05-13 20:07 UTC · model grok-4.3

classification 💻 cs.DL cs.IR
keywords: recommender systems · user modeling · digital archives · humanities scholars · information seeking · context volatility · epistemic trust · research strands

The pith

Humanities scholars need recommender systems built around shifting research contexts, provenance trust, contrastive items, and long-term strands rather than stable preferences and short sessions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether standard recommender system user models, built on assumptions of stable tastes and session-limited interactions, fit humanities researchers using digital archives. Through focus groups and interviews with 18 scholars, it identifies four specific divergences: preferences change with tasks and expertise, relevance hinges on verifiable provenance, researchers actively seek materials that challenge their current views, and work continues across extended research threads instead of isolated sessions. These differences matter because they imply that off-the-shelf collaborative filtering or content-based methods may misalign with scholarly goals. The authors position the four dimensions as a diagnostic framework for adapting recommendation approaches in archives and similar domains.

Core claim

User models for recommendation in digital archives must incorporate four dimensions where scholarly information-seeking diverges from common RecSys assumptions: context volatility, where preferences shift with research tasks and domain expertise; epistemic trust, where relevance depends on verifiable provenance; contrastive seeking, where researchers pursue items that challenge their current direction; and strand continuity, where research spans long-term threads rather than discrete sessions. The dimensions are derived from qualitative analysis of focus groups and interviews and are discussed in relation to collaborative filtering, content-based, and session-based recommendation techniques.

What carries the argument

The diagnostic framework of four dimensions (context volatility, epistemic trust, contrastive seeking, and strand continuity) that identifies mismatches between standard RecSys user modeling and humanities scholarly practices.

If this is right

  • Recommendation algorithms should maintain persistent profiles of long-term research strands instead of resetting per session.
  • Systems need to surface provenance metadata prominently to support epistemic trust judgments.
  • Models must include mechanisms for delivering contrastive or challenging items rather than only similarity-based matches.
  • User models require dynamic adaptation to changes in research task and expertise level.
  • The same four dimensions can serve as a diagnostic checklist for recommendation design in other low-volume, high-expertise domains.
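To make the implications above concrete, here is a minimal sketch of what a user model carrying the four dimensions could look like. All class, field, and value names are hypothetical illustrations, not structures proposed in the paper:

```python
from dataclasses import dataclass, field

@dataclass
class ResearchStrand:
    """Strand continuity: a long-term research thread that outlives sessions."""
    topic: str
    item_ids: list = field(default_factory=list)

@dataclass
class ScholarProfile:
    expertise: str                      # context volatility: expertise level shifts relevance
    current_task: str                   # context volatility: preferences shift with task
    min_provenance_score: float = 0.8   # epistemic trust: threshold on verifiable provenance
    contrastive_quota: float = 0.2      # contrastive seeking: share of challenging items
    strands: list = field(default_factory=list)

    def active_strand(self, topic: str) -> ResearchStrand:
        """Resume an existing long-term strand instead of starting a fresh session."""
        for s in self.strands:
            if s.topic == topic:
                return s
        s = ResearchStrand(topic=topic)
        self.strands.append(s)
        return s

profile = ScholarProfile(expertise="medieval history", current_task="charter analysis")
strand = profile.active_strand("monastic charters")
strand.item_ids.append("doc-1402")
# The same strand persists across sessions rather than being reset:
assert profile.active_strand("monastic charters") is strand
```

The point of the sketch is only that each dimension maps to a concrete modeling decision: a persistent strand store, a provenance threshold, a contrastive quota, and mutable task/expertise context.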

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hybrid systems could combine short-term session signals with persistent strand tracking to improve relevance over months or years.
  • The framework might transfer to social science or law archives where provenance and challenge-seeking are also central.
  • Quantifying the impact of these dimensions on user satisfaction would require controlled A/B tests in live archive interfaces.
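The first extension above, blending short-term session signals with persistent strand tracking, can be sketched in a few lines. The weights, quota, and function names are illustrative assumptions, not anything the paper proposes:

```python
def hybrid_score(session_sim: float, strand_sim: float,
                 w_session: float = 0.4, w_strand: float = 0.6) -> float:
    """Blend per-session similarity with long-term strand affinity."""
    return w_session * session_sim + w_strand * strand_sim

def recommend(candidates, k: int = 5, contrastive_quota: float = 0.2):
    """candidates: list of (item_id, session_sim, strand_sim) tuples."""
    ranked = sorted(candidates, key=lambda c: hybrid_score(c[1], c[2]), reverse=True)
    n_contrast = max(1, int(k * contrastive_quota))
    top = ranked[: k - n_contrast]
    # Contrastive seeking: reserve the tail slots for the *least* similar items,
    # so the list is not purely similarity-matched.
    contrast = ranked[-n_contrast:]
    return [item_id for item_id, _, _ in top + contrast]

cands = [("a", 0.9, 0.1), ("b", 0.2, 0.9), ("c", 0.5, 0.5),
         ("d", 0.1, 0.1), ("e", 0.8, 0.7)]
print(recommend(cands, k=4))  # → ['e', 'b', 'c', 'd']
```

Note that "d", the lowest-scoring candidate under the hybrid signal, still appears, which is exactly the behavior a pure similarity ranker would suppress.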

Load-bearing premise

Insights from focus groups and interviews with 18 researchers generalize to the broader population of humanities scholars without larger-scale quantitative validation or testing across subfields.

What would settle it

A quantitative survey of several hundred humanities scholars across multiple subfields that finds no measurable evidence of context volatility, epistemic trust effects, contrastive seeking, or strand continuity in their archive interactions would falsify the central claim.

read the original abstract

User models for recommender systems (RecSys) typically assume stable preferences, similarity-based relevance, and session-bounded interactions -- assumptions derived from high-volume consumer contexts. This paper investigates these assumptions for humanities scholars working with digital archives. Following a human-centered design approach, we conducted focus groups and analyzed interview data from 18 researchers. Our analysis identifies four dimensions where scholarly information-seeking diverges from common RecSys user modeling: (1) context volatility -- preferences shift with research tasks and domain expertise; (2) epistemic trust -- relevance depends on verifiable provenance; (3) contrastive seeking -- researchers seek items that challenge their current direction; and (4) strand continuity -- research spans long-term threads rather than discrete sessions. We discuss implications for user modeling and outline how these dimensions relate to collaborative filtering, content-based, and session-based recommendation. We propose these dimensions as a diagnostic framework applicable beyond archives to similar application domains where typical user modeling assumptions may not hold.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that a human-centered qualitative analysis of focus groups and interviews with 18 humanities researchers reveals four systematic divergences from standard RecSys user-model assumptions—context volatility (preferences shift with tasks and expertise), epistemic trust (relevance tied to verifiable provenance), contrastive seeking (preference for challenging items), and strand continuity (long-term research threads over sessions)—and proposes these as a diagnostic framework for user modeling in digital archives and similar domains.

Significance. If the dimensions prove robust beyond the sample, the work offers a useful corrective to consumer-derived RecSys assumptions when applied to scholarly archives, potentially guiding more appropriate modeling choices for collaborative filtering, content-based, and session-based approaches in low-volume, high-expertise settings.

major comments (2)
  1. [Methods] Methods section: the account of interview protocols, participant selection criteria, coding procedures, and any reliability checks (e.g., inter-coder agreement) is too brief to allow evaluation of how the four dimensions were reliably extracted from the 18-participant data.
  2. [Results/Discussion] Results and Discussion: the claim that the four dimensions constitute a generalizable diagnostic framework rests on an untested assumption of representativeness; no evidence is provided on subfield coverage, data saturation, or triangulation with existing literature on scholarly search behavior, leaving the central claim vulnerable to sample-specific artifacts.
minor comments (1)
  1. [Abstract/Introduction] Abstract and §1: the phrasing 'focus groups and analyzed interview data' is ambiguous; clarify whether focus groups were a distinct data source or part of the interview process.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important areas for strengthening the manuscript. We address each major comment below and indicate planned revisions.

read point-by-point responses
  1. Referee: [Methods] Methods section: the account of interview protocols, participant selection criteria, coding procedures, and any reliability checks (e.g., inter-coder agreement) is too brief to allow evaluation of how the four dimensions were reliably extracted from the 18-participant data.

    Authors: We agree that the Methods section requires expansion for transparency and replicability. In the revised manuscript we will add a detailed subsection describing the semi-structured interview protocol (including sample questions and focus-group facilitation), explicit participant selection criteria (purposive sampling across humanities disciplines with attention to career stage and institutional affiliation), the inductive thematic coding process (following Braun and Clarke's six-phase approach), and reliability steps (two independent coders reviewed 25% of transcripts, with discrepancies resolved through discussion and a final codebook audit). revision: yes
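The inter-coder reliability step described in this response is conventionally quantified with Cohen's kappa, which corrects raw agreement for chance. A generic sketch with invented labels, not the authors' data or procedure:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b) -> float:
    """Chance-corrected agreement between two coders' label sequences."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Observed agreement: fraction of items both coders labeled identically.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement under independence of the two coders' label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

a = ["trust", "strand", "trust", "contrast", "context", "trust"]
b = ["trust", "strand", "context", "contrast", "context", "trust"]
print(round(cohens_kappa(a, b), 3))  # → 0.769
```

Reporting kappa on the doubly coded 25% of transcripts (rather than raw percent agreement) is what would let a reader judge whether the codebook was applied reliably.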

  2. Referee: [Results/Discussion] Results and Discussion: the claim that the four dimensions constitute a generalizable diagnostic framework rests on an untested assumption of representativeness; no evidence is provided on subfield coverage, data saturation, or triangulation with existing literature on scholarly search behavior, leaving the central claim vulnerable to sample-specific artifacts.

    Authors: The referee correctly notes the absence of explicit discussion on these points. Our study is qualitative and exploratory; we do not claim statistical generalizability but present the four dimensions as a diagnostic framework to surface divergences from consumer-derived RecSys assumptions. In revision we will insert a Limitations subsection that (a) reports subfield coverage (participants spanned history, literature, philosophy, and art history), (b) states that thematic saturation was reached after the 14th interview with no new codes emerging, and (c) triangulates findings against prior work on scholarly information seeking (e.g., Ellis, Meho & Tibbo). We will also revise the abstract and conclusion to frame the dimensions as a starting point for future validation rather than a fully generalizable model. revision: partial
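The saturation claim in (b) can be made auditable with a simple check: track how many previously unseen codes each successive interview contributes, and report the last interview that added any. The data below is invented for illustration:

```python
def saturation_point(codes_per_interview) -> int:
    """Return the 1-based index of the last interview that contributed a new code."""
    seen, last_new = set(), 0
    for i, codes in enumerate(codes_per_interview, start=1):
        if set(codes) - seen:       # any code not seen in earlier interviews?
            last_new = i
        seen |= set(codes)
    return last_new

interviews = [
    {"context"}, {"trust"}, {"context", "strand"}, {"contrast"},
    {"trust", "strand"}, {"context"}, {"contrast"},
]
print(saturation_point(interviews))  # no new codes after interview 4 here
```

A table of per-interview new-code counts like this, included in the revised Limitations subsection, would substantiate the "saturation after the 14th interview" statement rather than merely asserting it.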

Circularity Check

0 steps flagged

No circularity: framework derived inductively from interview data

full rationale

The paper presents its four dimensions as the direct result of qualitative analysis of focus groups and interviews with 18 humanities researchers. No equations, fitted parameters, self-citations, or derivations appear in the provided text. The central claims do not reduce to inputs by construction, nor do they rely on load-bearing self-citations or renamed known results. This is a standard empirical user-study approach that remains self-contained without circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a qualitative empirical study based on user interviews with no mathematical components, free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5468 in / 1054 out tokens · 44466 ms · 2026-05-13T20:07:22.039432+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1] Florian Atzenhofer-Baumgartner, Bernhard C. Geiger, Christoph Trattner, Georg Vogeler, and Dominik Kowald. 2024. Challenges in Implementing a Recommender System for Historical Research in the Humanities. arXiv abs/2410.20909 (Oct. 2024). doi:10.48550/arXiv.2410.20909
  2. [2] Florian Atzenhofer-Baumgartner, Bernhard C. Geiger, Georg Vogeler, and Dominik Kowald. 2024. Value Identification in Multistakeholder Recommender Systems for Humanities and Historical Research: The Case of the Digital Archive Monasterium.Net. arXiv:2409.17769 (Sept. 2024). doi:10.48550/arXiv.2409.17769
  3. [3] Florian Atzenhofer-Baumgartner, Georg Vogeler, and Dominik Kowald. 2025. A Multistakeholder Approach to Value-Driven Co-design of Recommender Systems Evaluation Metrics in Digital Archives. In Proceedings of the Nineteenth ACM Conference on Recommender Systems. ACM, Prague, Czech Republic, 503–508. doi:10.1145/3705328.3748026
  4. [4] Marcia J. Bates. 1989. The design of browsing and berrypicking techniques for the online search interface. Online Review 13, 5 (May 1989), 407–424. doi:10.1108/eb024320
  5. [5] Christine Bauer, Eva Zangerle, and Alan Said. 2024. Exploring the Landscape of Recommender Systems Evaluation: Practices and Perspectives. ACM Transactions on Recommender Systems 2, 1 (March 2024), 1–31. doi:10.1145/3629170
  6. [6] Brett Binst, Lien Michiels, and Annelien Smets. 2025. What Is Serendipity? An Interview Study to Conceptualize Experienced Serendipity in Recommender Systems. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. ACM, New York City, USA, 243–252. doi:10.1145/3699682.3728325
  7. [7–8] Robin Burke, Gediminas Adomavicius, Toine Bogers, Tommaso Di Noia, Dominik Kowald, Julia Neidhardt, Özlem Özgöbek, Maria Soledad Pera, and Jürgen Ziegler. 2024. Dagstuhl Seminar on Evaluation Perspectives of Recommender Systems: Multistakeholder and Multimethod Evaluation. Dagstuhl Report on Evaluation Perspectives of Recommender Systems: Driving Research and Education (2024).
  9. [9] Robin Burke, Gediminas Adomavicius, Toine Bogers, Tommaso Di Noia, Dominik Kowald, Julia Neidhardt, Özlem Özgöbek, Maria Soledad Pera, Nava Tintarev, and Jürgen Ziegler. 2025. De-centering the (Traditional) User: Multistakeholder Evaluation of Recommender Systems. arXiv:2501.05170 (Jan. 2025). doi:10.48550/arXiv.2501.05170
  10. [10] Mario Casillo, Francesco Colace, Domenico Conte, Marco Lombardi, Domenico Santaniello, and Carmine Valentino. 2023. Context-aware recommender systems and cultural heritage: a survey. Journal of Ambient Intelligence and Humanized Computing 14, 6 (2023), 7427–7458. doi:10.1007/s12652-021-03438-9
  11. [11] Alexandra M. Chassanoff. 2018. Historians' Experiences Using Digitized Archival Photographs as Evidence. The American Archivist 81, 1 (March 2018), 135–164. doi:10.17723/0360-9081-81.1.135
  12. [12] Amber L. Cushing and Giulia Osti. 2023. "So How Do We Balance All of These Needs?": How the Concept of AI Technology Impacts Digital Archival Expertise. Journal of Documentation 79, 7 (Dec. 2023), 12–29. doi:10.1108/JD-08-2022-0170
  13. [13] Michael D. Ekstrand, Afsaneh Razi, Aleksandra Sarcevic, Maria Soledad Pera, Robin Burke, and Katherine Landau Wright. 2025. Recommending With, Not For: Co-Designing Recommender Systems for Social Good. arXiv:2508.03792. doi:10.48550/arXiv.2508.03792
  14. [14] Yingqiang Ge, Shuchang Liu, Zuohui Fu, Juntao Tan, Zelong Li, Shuyuan Xu, Yunqi Li, Yikun Xian, and Yongfeng Zhang. 2025. A Survey on Trustworthy Recommender Systems. ACM Transactions on Recommender Systems 3, 2 (June 2025), 1–68. doi:10.1145/3652891
  15. [15] Muheeb Faizan Ghori, Arman Dehpanah, Jonathan Gemmell, and Bamshad Mobasher. 2025. "They Only Offer the Illusion of Choice": Exploring User Perceptions of Control and Agency on YouTube. In Adjunct Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. ACM, New York City, USA, 214–218. doi:10.1145/3708319.3733664
  16. [16] Zhicheng He, Weiwen Liu, Wei Guo, Jiarui Qin, Yingxue Zhang, Yaochen Hu, and Ruiming Tang. 2023. A Survey on User Behavior Modeling in Recommender Systems. arXiv:2302.11087. doi:10.48550/arXiv.2302.11087
  17. [17] Rully Agus Hendrawan, Peter Brusilovsky, Arun Balajiee Lekshmi Narayanan, and Jordan Barria-Pineda. 2024. Explanations in Open User Models for Personalized Information Exploration. In Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization. ACM, Cagliari, Italy, 256–263. doi:10.1145/3631700.3665188
  18. [18] Anastasiia Klimashevskaia, Dietmar Jannach, Mehdi Elahi, and Christoph Trattner. 2024. A Survey on Popularity Bias in Recommender Systems. User Modeling and User-Adapted Interaction 34, 5 (2024), 1777–1834. doi:10.1007/s11257-024-09406-0
  19. [19] Thomas Elmar Kolb, Irina Nalis, and Julia Neidhardt. 2025. Bridging Preferences: Multi-Stakeholder Insights on Ideal News Recommendations. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. ACM, New York City, USA, 268–272. doi:10.1145/3699682.3728355
  20. [20] Ivica Kostric, Krisztian Balog, and Ujwal Gadiraju. 2025. Should We Tailor the Talk? Understanding the Impact of Conversational Styles on Preference Elicitation in Conversational Recommender Systems. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. ACM, New York City, USA, 164–173. doi:10.1145/3699682.3728353
  21. [21] Elina Late, Hille Ruotsalainen, and Sanna Kumpulainen. 2023. In a Perfect World: Exploring the Desires and Realities for Digitized Historical Image Archives. Proceedings of the Association for Information Science and Technology 60, 1 (Oct. 2023), 244–254. doi:10.1002/pra2.785
  22. [22] Wenqi Li, Jui-Ching Kuo, Manyu Sheng, Pengyi Zhang, and Qunfang Wu. 2025. Beyond Explicit and Implicit: How Users Provide Feedback to Shape Personalized Recommendation Content. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama, Japan, 1–17. doi:10.1145/3706598.3713241
  23. [23] Yueru Li, Jin Zhang, and Jue Wang. 2024. A systematic review of information-seeking behavior of humanities scholars in the digital environment. Journal of Documentation 80, 7 (2024), 1–25. doi:10.1108/JD-01-2024-0015
  24. [24] Jianxun Lian, Iyad Batal, Zheng Liu, Akshay Soni, Eun Yong Kang, Yajun Wang, and Xing Xie. 2021. Multi-Interest-Aware User Modeling for Large-Scale Sequential Recommendations. arXiv:2102.09211 (May 2021). doi:10.48550/arXiv.2102.09211
  25. [25] Krystyna K. Matusiak. 2022. Evaluating a Digital Community Archive from the User Perspective: The Case of Formative Multifaceted Evaluation. Library & Information Science Research 44, 3 (July 2022), 101159. doi:10.1016/j.lisr.2022.101159
  26. [26] Marta Moscati, Darius Afchar, Markus Schedl, and Bruno Sguerra. 2025. Familiarizing with Music: Discovery Patterns for Different Music Discovery Needs. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization. ACM, New York City, USA, 63–72. doi:10.1145/3699682.3728333
  27. [27] Irina Nalis, Tobias Sippl, Thomas Elmar Kolb, and Julia Neidhardt. 2024. Navigating Serendipity - An Experimental User Study On The Interplay of Trust and Serendipity In Recommender Systems. In Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization. ACM, Cagliari, Italy, 386–393. doi:10.1145/3631700.3664901
  28. [28] Trevor Owens and Thomas Padilla. 2021. Digital Sources and Digital Archives: Historical Evidence in the Digital Age. International Journal of Digital Humanities 1, 3 (July 2021), 325–341. doi:10.1007/s42803-020-00028-7
  29. [29] Erasmo Purificato, Ludovico Boratto, and Ernesto William De Luca. 2024. User Modeling and User Profiling: A Comprehensive Survey. arXiv:2402.09660. doi:10.48550/arXiv.2402.09660
  30. [30] Alan Said, Maria Soledad Pera, and Michael D. Ekstrand. 2025. We're Still Doing It (All) Wrong: Recommender Systems, Fifteen Years Later. arXiv:2509.09414. doi:10.48550/arXiv.2509.09414
  31. [31] Reijo Savolainen. 2018. Berrypicking and information foraging: Comparison of two theoretical frameworks for studying exploratory search. Journal of Information Science 44, 5 (Oct. 2018), 580–593. doi:10.1177/0165551517713168
  32. [32–33] Donghee Sinn and Nicholas Soares. 2014. Historians' Use of Digital Archival Collections: The Web, Historical Scholarship, and Archival Research. Journal of the Association for Information Science and Technology 65, 9 (Sept. 2014), 1794–. doi:10.1002/asi.23091
  34. [34] Ciaran B. Trace and Unmil P. Karadkar. 2017. Information Management in the Humanities: Scholarly Processes, Tools, and the Construction of Personal Collections. Journal of the Association for Information Science and Technology 68, 2 (Feb. 2017), 491–507. doi:10.1002/asi.23678
  35. [35] Shoujin Wang, Qi Zhang, Liang Hu, Xiuzhen Zhang, Yan Wang, and Charu Aggarwal. 2022. Sequential/Session-Based Recommendations: Challenges, Approaches, Applications and Opportunities. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, Madrid, Spain, 3425–3428. doi:10.1145/3477495.3532685
  36. [36] Kathrin Wardatzky, Oana Inel, Luca Rossetto, and Abraham Bernstein. 2025. Whom Do Explanations Serve? A Systematic Literature Survey of User Characteristics in Explainable Recommender Systems Evaluation. ACM Transactions on Recommender Systems (Feb. 2025), 3716394. doi:10.1145/3716394
  37. [37] Sonia Yaco, Bala Desinghu, Claire Warwick, and Richard Anderson. 2025. What Can AI Do for Special Collections? The American Archivist 88, 2 (2025), 441–473. doi:10.17723/2327-9702-88.2.441
  38. [38] Yuxiang Chris Zhao, Jingwen Lian, Yan Zhang, Shijie Song, and Xinlin Yao. 2024. Value Co-creation in Cultural Heritage Information Practices. Journal of the Association for Information Science and Technology 75, 3 (March 2024), 298–323. doi:10.1002/asi.24862