pith. machine review for the scientific record.

arxiv: 2604.20528 · v1 · submitted 2026-04-22 · 💻 cs.DL · cs.CY

Recognition: unknown

Evolution of Research Method Usage Across the Academic Careers of Library and Information Science Scholars


Pith reviewed 2026-05-09 23:23 UTC · model grok-4.3

classification 💻 cs.DL cs.CY
keywords library and information science · research methods · academic careers · bibliometric methods · method diversity · full-text analysis · scholarly publishing · topic modeling

The pith

Library and information science scholars rely more on bibliometric methods later in their careers while overall method diversity rises then falls.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper follows 435 senior LIS scholars across more than 14 years of publishing to map how they choose research methods. It shows bibliometric techniques rising from roughly one-fifth of early papers to nearly one-third of later ones, with the spread of different methods expanding at first and then contracting. Scholars routinely combine several methods at once, sometimes pairing standard and less common approaches, and the specific methods that dominate shift as academic age increases. The work rests on an automated full-text classifier that sorts papers into sixteen method categories, a Top2Vec topic model that groups them into twenty topics, and author-name disambiguation that links those labels to each author's publication timeline.

Core claim

Bibliometric methods are the single most frequent category at every career stage, growing from 19.61 percent among early-career scholars to 31.81 percent among seniors; method diversity follows an inverted-U pattern over academic age; researchers combine multiple methods, including conventional and unconventional pairings; and the leading methods themselves change with seniority.

What carries the argument

Automated full-text classification model that assigns each article to one of sixteen research-method categories, paired with author-name disambiguation and academic-age calculation from publication records.
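
As a rough illustration of the academic-age step, the sketch below counts age from each author's first publication year and assigns a career stage. The cut-offs (academic age below 7, 7 to 14, above 14) follow the grouping the paper reports for young, middle-aged, and senior scholars; the record layout, the function names, and the choice to start age at zero are assumptions made for this example, not the authors' code.

```python
def academic_ages(records):
    """Attach an academic age to each (author_id, publication_year) record.

    Assumption: academic age = publication year minus the author's
    first publication year, so the first paper has age 0.
    """
    records = list(records)
    first_year = {}
    for author, year in records:
        first_year[author] = min(year, first_year.get(author, year))
    return [(author, year, year - first_year[author]) for author, year in records]


def career_stage(age):
    """Map an academic age to the three stages reported in the paper."""
    if age < 7:
        return "young"
    if age <= 14:
        return "middle-aged"
    return "senior"


# Hypothetical records, for illustration only.
example_records = [("a", 2000), ("a", 2008), ("a", 2020), ("b", 2010)]
for author, year, age in academic_ages(example_records):
    print(author, year, age, career_stage(age))
```

Errors in author-name disambiguation would shift the first publication year and therefore every age derived from it, which is why that step matters for the argument.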

If this is right

  • Bibliometric methods become steadily more central as scholars gain seniority.
  • The range of methods a scholar draws on expands early then narrows later.
  • Researchers frequently use more than one method per study and mix familiar with less common ones.
  • The identity of the most-used methods shifts measurably with academic age.
  • Patterns in method choice can be tracked continuously across long careers using full-text data.
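
This summary does not say which diversity measure the paper uses, so the sketch below substitutes Shannon entropy as one plausible stand-in. It shows how an expand-then-narrow pattern would look as a number per career stage; the function name and the sample labels are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(method_labels):
    """Shannon entropy (natural log) of the method labels in one group.

    Assumption: this is a stand-in metric; the paper's own diversity
    measure may differ.
    """
    counts = Counter(method_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Made-up labels per career stage, for illustration only.
labels_by_stage = {
    "young": ["bibliometrics", "bibliometrics", "experiment"],
    "middle-aged": ["bibliometrics", "experiment", "interview", "questionnaire"],
    "senior": ["bibliometrics", "bibliometrics", "bibliometrics", "interview"],
}
for stage, labels in labels_by_stage.items():
    print(stage, round(shannon_diversity(labels), 3))
```

Entropy is zero when a group uses a single method and grows as usage spreads more evenly across categories, so a rise followed by a fall across stages would match the inverted-U description.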

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same rise-then-fall diversity curve might appear in other disciplines if the classification model generalizes.
  • Later-career narrowing of methods could reflect growing specialization around particular research questions.
  • Early-career exposure to a wider method set might reduce later narrowing if deliberately encouraged.
  • Topic modeling combined with method labels could reveal whether certain questions drive the observed method shifts.

Load-bearing premise

The automated classifier correctly labels every paper's research methods without systematic mistakes that differ by career stage or research topic.

What would settle it

A hand-coded sample of one hundred early-career and one hundred senior papers whose method labels are compared against the model's output to check for accuracy differences by academic age.
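
A minimal sketch of how such a check could be scored, assuming the hand-coded labels are treated as ground truth and a pooled two-proportion z-test is an acceptable comparison. The function names are hypothetical and the counts in the usage line are invented; nothing here comes from the paper.

```python
import math


def accuracy(hand_labels, model_labels):
    """Share of papers where the model label matches the hand-coded label."""
    matches = sum(h == m for h, m in zip(hand_labels, model_labels))
    return matches / len(hand_labels)


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups are all-correct or all-wrong: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: 90 of 100 early-career and 75 of 100 senior papers correct.
z, p = two_proportion_z(90, 100, 75, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest that classifier accuracy differs between the two career stages, which is the failure mode the premise above rules out by assumption.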

Figures

Figures reproduced from arXiv: 2604.20528 by Chengzhi Zhang, Jiayi Hao.

Figure 1: Framework of this study.

Original abstract

Research methods constitute an indispensable tool for scholars engaged in scientific inquiry. Investigating how scholars use research methods throughout their careers can reveal distinct patterns in method adoption, providing valuable insights for novice researchers in selecting appropriate methods. This study employs a comprehensive dataset comprising full-text journal articles and bibliographic records from the Library and Information Science (LIS) domain. Utilizing an automated classification model based on full-text cognitive analysis, the research methods employed by LIS scholars are systematically identified. Topic modeling was then conducted using Top2Vec. Subsequently, author name disambiguation is performed, and academic age is calculated for each scholar. This study focuses on 435 senior scholars with an academic age of more than 14 years and a consistent publication record at five-year intervals, covering a total of 6,116 articles. The corpus covers 16 research method categories and 20 research topics. The findings indicate that bibliometric methods are the most frequently used across career stages, accounting for 19.61% among early-career scholars and 31.81% among senior scholars. Over the course of a scholarly career, the diversity of research methods initially increases and then declines. Furthermore, scholars exhibit a propensity for combining multiple research methods, including both conventional and unconventional pairings. Notably, the research methods most commonly used by researchers change with age and seniority.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. This paper examines the evolution of research method usage among Library and Information Science (LIS) scholars over their academic careers. Using a dataset of 6,116 full-text articles from 435 senior scholars (academic age >14 years with consistent publications at five-year intervals), it employs an automated classification model based on full-text cognitive analysis to categorize papers into 16 research method categories, applies Top2Vec for topic modeling, performs author disambiguation, and calculates academic age. Key findings include the dominance of bibliometric methods (rising from 19.61% in early-career to 31.81% in senior stages), an initial increase followed by a decline in method diversity, a tendency to combine multiple methods (both conventional and unconventional), and shifts in preferred methods with age and seniority.

Significance. If the automated classification proves reliable and unbiased across career stages and topics, the study offers valuable empirical insights into how research practices evolve in the LIS field. The large-scale analysis of full-text data and focus on longitudinal career patterns could inform mentoring, curriculum design, and understanding of methodological trends in information science. The identification of method combinations and diversity patterns adds nuance to bibliometric studies of scholarly careers. The use of a large curated corpus of senior scholars with consistent records is a positive design choice for longitudinal analysis.

major comments (3)
  1. The automated classification model based on full-text cognitive analysis (described in the abstract and Methods) is load-bearing for every headline result, yet the manuscript reports no accuracy, precision, recall, confusion matrix, training data details, or error rates. There is also no stratified analysis of misclassification risk by academic age or topic, which directly threatens the validity of the reported rise in bibliometric usage (19.61% to 31.81%), the diversity increase-then-decline pattern, and the method-combination findings.
  2. The selection criteria for the 435 scholars (academic age threshold >14 years and consistent publication records at five-year intervals) are not accompanied by sensitivity tests or discussion of how varying these free parameters affects the observed trends in method diversity and combinations; this choice restricts the sample to a non-random subset of LIS scholars and may bias the career-evolution claims.
  3. The reported percentages lack error bars or confidence intervals, and the manuscript contains no discussion of how author disambiguation failures or Top2Vec topic-model instability could propagate into the diversity and combination trends (see abstract and Results).
minor comments (2)
  1. The abstract and text should explicitly list the 16 research method categories and 20 topics for reproducibility.
  2. Clarify the exact operational definition of 'early-career' versus 'senior' stages and how academic age is computed from the disambiguated records.
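
The stratified check requested in major comment 1 could look roughly like the following, assuming a hand-labelled validation set is available. The row layout `(stratum, hand_label, model_label)` and the function name are assumptions for illustration; the stratum could be a career stage or a topic.

```python
from collections import defaultdict


def stratified_precision_recall(rows, target):
    """Per-stratum precision and recall for one method category.

    rows: iterable of (stratum, hand_label, model_label).
    Returns {stratum: (precision, recall)}; a value is None when the
    denominator is zero in that stratum.
    """
    tallies = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for stratum, truth, predicted in rows:
        t = tallies[stratum]
        if predicted == target and truth == target:
            t["tp"] += 1
        elif predicted == target:
            t["fp"] += 1
        elif truth == target:
            t["fn"] += 1
    result = {}
    for stratum, t in tallies.items():
        predicted_pos = t["tp"] + t["fp"]
        actual_pos = t["tp"] + t["fn"]
        precision = t["tp"] / predicted_pos if predicted_pos else None
        recall = t["tp"] / actual_pos if actual_pos else None
        result[stratum] = (precision, recall)
    return result
```

If precision for bibliometrics were noticeably lower among senior papers than among early-career ones, part of the reported rise from 19.61% to 31.81% could be a labelling artifact rather than a change in practice.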

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas where additional transparency and robustness checks will improve the manuscript. We address each major comment below and indicate the revisions we will make.

Point-by-point responses
  1. Referee: The automated classification model based on full-text cognitive analysis (described in the abstract and Methods) is load-bearing for every headline result, yet the manuscript reports no accuracy, precision, recall, confusion matrix, training data details, or error rates. There is also no stratified analysis of misclassification risk by academic age or topic, which directly threatens the validity of the reported rise in bibliometric usage (19.61% to 31.81%), the diversity increase-then-decline pattern, and the method-combination findings.

    Authors: We acknowledge that the current version of the manuscript does not report performance metrics or stratified validation for the automated classification model. In the revised manuscript we will add a new subsection in Methods that details the training data, model development, accuracy, precision, recall, F1 scores, and full confusion matrix. We will also perform and report a stratified evaluation of classification performance broken down by academic age bins and by the 20 research topics identified via Top2Vec, explicitly quantifying any differential misclassification risk that could affect the reported trends in bibliometric usage, method diversity, and combinations. revision: yes

  2. Referee: The selection criteria for the 435 scholars (academic age threshold >14 years and consistent publication records at five-year intervals) are not accompanied by sensitivity tests or discussion of how varying these free parameters affects the observed trends in method diversity and combinations; this choice restricts the sample to a non-random subset of LIS scholars and may bias the career-evolution claims.

    Authors: The criteria were chosen to ensure sufficient longitudinal coverage for tracking within-scholar change over career stages. We agree that sensitivity to these thresholds should be demonstrated. The revised manuscript will include sensitivity analyses that vary the academic-age cutoff (e.g., >10 and >18 years) and the publication-consistency requirement, showing the resulting changes in method-diversity curves and combination frequencies. We will also add explicit discussion of how these design choices affect generalizability to the wider population of LIS scholars. revision: yes

  3. Referee: The reported percentages lack error bars or confidence intervals, and the manuscript contains no discussion of how author disambiguation failures or Top2Vec topic-model instability could propagate into the diversity and combination trends (see abstract and Results).

    Authors: We will add bootstrap-derived confidence intervals or standard errors to all percentage figures in the Results section. In the Limitations and Discussion we will explicitly address potential error propagation from author-name disambiguation and Top2Vec topic-model instability, including any robustness checks (e.g., alternative disambiguation thresholds or topic-model hyper-parameter sweeps) that we can perform on the existing data, and we will discuss how such uncertainty could influence the observed diversity and combination patterns. revision: partial
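
A minimal sketch of the bootstrap interval mentioned in the third response, using a percentile bootstrap over papers. The function name, the resample count, and the sample data are assumptions; the rebuttal does not say which bootstrap variant the authors intend.

```python
import random


def bootstrap_share_ci(labels, target, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the share of `target` in `labels`.

    Assumption: papers are resampled independently, which ignores the
    clustering of papers within scholars.
    """
    rng = random.Random(seed)
    n = len(labels)
    shares = []
    for _ in range(n_resamples):
        sample = rng.choices(labels, k=n)
        shares.append(sum(1 for x in sample if x == target) / n)
    shares.sort()
    lower = shares[int((alpha / 2) * n_resamples)]
    upper = shares[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Invented sample: 30 of 100 papers labelled bibliometrics.
sample_labels = ["bibliometrics"] * 30 + ["other"] * 70
print(bootstrap_share_ci(sample_labels, "bibliometrics"))
```

Because one scholar contributes many papers, resampling whole scholars rather than single papers would probably give more honest intervals; the version above is the simpler of the two.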

Circularity Check

0 steps flagged

No circularity: purely observational corpus analysis with direct extraction of trends

full rationale

The manuscript performs straightforward empirical processing on a fixed corpus of 6,116 LIS articles: an automated full-text classifier assigns papers to 16 method categories, Top2Vec identifies 20 topics, author disambiguation and academic-age calculation are applied, and then simple frequency counts, diversity metrics, and co-occurrence patterns are tabulated by career stage. No equations, fitted parameters, or predictive models are introduced; the headline percentages (19.61% early-career bibliometrics rising to 31.81% for seniors) and the rise-then-fall diversity curve are literal aggregates of the classified labels. No self-citation chain, ansatz smuggling, or renaming of known results occurs; the classifier and topic model are treated as external processing steps whose outputs are counted directly. Because every reported finding reduces only to tabulation of the processed data rather than to any fitted or self-referential construct, the derivation chain contains no circular step.
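
The co-occurrence tabulation described above can be sketched as a plain count of method pairs per career stage. This assumes each paper carries a set of method labels, which is an assumption of the example; the input layout and function name are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def method_pair_counts(papers):
    """Count unordered method pairs per career stage.

    papers: iterable of (stage, set_of_method_labels).
    Returns a Counter keyed by (stage, (method_a, method_b)) with the
    pair sorted alphabetically. Single-method papers add nothing.
    """
    counts = Counter()
    for stage, methods in papers:
        for pair in combinations(sorted(methods), 2):
            counts[(stage, pair)] += 1
    return counts


# Made-up papers, for illustration only.
example_papers = [
    ("senior", {"bibliometrics", "content analysis"}),
    ("senior", {"content analysis", "bibliometrics", "interview"}),
    ("young", {"webometrics"}),
]
for key, n in method_pair_counts(example_papers).most_common():
    print(key, n)
```

Nothing in a count like this is fitted to the outcome, which is the sense in which the analysis is non-circular; its accuracy still depends entirely on the labels fed in.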

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Central claims depend on the unverified accuracy of the text classification model and author disambiguation procedure, plus the representativeness of the filtered senior-scholar subset.

free parameters (2)
  • academic age threshold = 14 years
    14 years chosen to define senior scholars with sufficient longitudinal span
  • publication consistency interval = 5 years
    5-year intervals required to ensure trackable career records
axioms (2)
  • domain assumption: The full-text cognitive analysis model correctly labels research methods into 16 categories across all career stages and topics
    Invoked when identifying methods from 6,116 articles; no performance metrics supplied
  • domain assumption: Author name disambiguation accurately links publications to individual scholars for academic age calculation
    Required to compute career trajectories for the 435 scholars
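
To make the two free parameters concrete, the sketch below filters scholars by an academic-age cutoff and a publication-consistency window, then counts how many remain as the cutoff varies. The reading of "consistent publication record at five-year intervals" as "at least one paper in every five-year window of the career" is an assumption of this example, as are the function names and the sample careers.

```python
def is_eligible(pub_years, min_age=14, window=5):
    """True if the career exceeds `min_age` years and has no empty window.

    Assumption: consistency means at least one publication in each
    consecutive `window`-year span counted from the first publication.
    """
    if not pub_years:
        return False
    first, last = min(pub_years), max(pub_years)
    if last - first <= min_age:
        return False
    ages = {year - first for year in pub_years}
    n_windows = (last - first) // window + 1
    return all(
        any(k * window <= age < (k + 1) * window for age in ages)
        for k in range(n_windows)
    )


def eligible_counts(careers, cutoffs=(10, 14, 18), window=5):
    """Number of eligible scholars for each academic-age cutoff."""
    return {
        cutoff: sum(is_eligible(years, min_age=cutoff, window=window)
                    for years in careers.values())
        for cutoff in cutoffs
    }


# Hypothetical careers, for illustration only.
example_careers = {
    "a": [2000, 2004, 2008, 2012, 2016, 2020],
    "b": [2000, 2004, 2008, 2012],
    "c": [2000, 2016],
}
print(eligible_counts(example_careers))
```

Rerunning the downstream tabulations on each filtered set is one way to show whether the reported trends survive a change in either parameter.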


