Citation Farming on ResearchGate: Blatant and Effective
Pith reviewed 2026-05-10 11:48 UTC · model grok-4.3
The pith
Papers from suspected boosting accounts on ResearchGate form clusters with identical reference lists that disproportionately cite certain authors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is that equal references groups function as an interpretable structural signal for coordinated or automated citation boosting. In the analyzed collection many papers belong to such groups, resulting in disproportionate citation to a small set of authors, and a substantial share of some authors' citations can be traced directly to these suspicious clusters rather than to independent citing papers.
What carries the argument
Equal references groups: clusters of papers that share identical reference lists, serving as the detectable pattern that distinguishes coordinated boosting from normal citation behavior.
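As a rough illustration of the signal, the sketch below groups papers whose reference lists contain exactly the same items. It is not the paper's implementation: the function name `equal_reference_groups`, the `min_size` parameter, and the toy identifiers are hypothetical, and treating a reference list as an unordered set is an assumption about how "equal" is defined.

```python
from collections import defaultdict

def equal_reference_groups(papers, min_size=2):
    """Group paper ids whose reference lists contain the same items.

    `papers` maps a paper id to an iterable of reference identifiers.
    Order and duplicates are ignored (an assumption made for this sketch).
    """
    buckets = defaultdict(list)
    for paper_id, refs in papers.items():
        key = frozenset(refs)
        if key:  # skip papers with no extracted references
            buckets[key].append(paper_id)
    # Keep only clusters with at least `min_size` papers.
    return [sorted(ids) for ids in buckets.values() if len(ids) >= min_size]

# Toy data: p1 and p2 share the same references in a different order.
papers = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r3", "r2", "r1"],
    "p3": ["r1", "r2"],
    "p4": ["r9"],
}
groups = equal_reference_groups(papers)
print(groups)
```

Hashing each reference list once keeps the grouping linear in the number of papers, which matters little at about 3000 papers but would help on larger corpora.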
If this is right
- Citation counts for certain authors can be substantially inflated by citations originating inside these coordinated clusters.
- The equal references motif offers a concrete, checkable indicator that can be applied to other citation networks to flag possible boosting.
- Author-level impact metrics on ResearchGate become less reliable when a measurable fraction of citations arrives through such groups.
- The observed pattern demonstrates that boosting services can alter citation distributions at scale on the platform.
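The inflation described above could be quantified per author as the share of incoming citations that originate from papers inside equal references groups. The following is a minimal sketch under assumed inputs: the `(citing_paper, cited_author)` pair format and the name `group_citation_share` are hypothetical, not taken from the paper.

```python
from collections import Counter

def group_citation_share(citations, grouped_papers):
    """Return {author: (total, from_groups, share)}.

    `citations` is an iterable of (citing_paper_id, cited_author) pairs.
    `grouped_papers` is the set of paper ids that belong to any
    equal references group.
    """
    total = Counter()
    from_groups = Counter()
    for citing, author in citations:
        total[author] += 1
        if citing in grouped_papers:
            from_groups[author] += 1
    return {a: (total[a], from_groups[a], from_groups[a] / total[a]) for a in total}

# Toy data: author A receives half of their citations from grouped papers.
citations = [
    ("p1", "A"), ("p2", "A"), ("p3", "A"), ("p4", "A"),
    ("p3", "B"), ("p4", "B"),
]
grouped = {"p1", "p2"}
shares = group_citation_share(citations, grouped)
print(shares)
```

A high share alone does not show that an author commissioned the boosting; it only locates where the citations come from.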
Where Pith is reading between the lines
- The same structural signal could be tested on citation data from other platforms to estimate how widespread coordinated boosting has become.
- Evaluations that rely on raw citation totals may benefit from filtering or weighting papers that belong to equal references clusters.
- Longitudinal tracking of authors who receive many citations from these groups could reveal whether the boosted metrics persist after the activity stops.
Load-bearing premise
The five accounts are correctly identified as boosting-service providers, and identical reference lists primarily indicate coordination rather than legitimate reuse such as templates or narrow subfield conventions.
What would settle it
Finding equal references groups at rates comparable to the studied collection in a large sample of papers known to be authored independently, without boosting involvement, would undermine the claim.
Original abstract
We investigate platform-native citation farming on ResearchGate by analyzing almost 3000 papers uploaded by five suspected boosting-service provider accounts. From the uploaded papers and associated metadata, we construct both paper-level and author-level citation networks. We introduce an interpretable structural signal for coordinated boosting, equal references groups: clusters of papers with equal reference lists. We find that many papers from our collection exhibit this motif, that is, they disproportionately cite a small set of authors, consistent with coordinated or automated boosting rather than independent scholarly practice. Finally, we show that for some authors in our dataset a substantial share of their citations can be attributed to these suspicious groups. A different citation network was used to validate the rareness of such motifs in legitimate scientific work.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes nearly 3000 papers uploaded by five suspected boosting-service provider accounts on ResearchGate. It constructs paper- and author-level citation networks from the uploaded papers and metadata, introduces 'equal references groups' (clusters of papers sharing identical reference lists) as an interpretable signal of coordinated boosting, reports that many papers in the collection exhibit this motif and disproportionately cite a small set of authors, shows that a substantial share of citations to some authors in the dataset can be attributed to these groups, and validates the motif's rarity via comparison to a separate legitimate citation network.
Significance. If the account identification and motif specificity hold, the work supplies direct empirical evidence of platform-native citation farming on ResearchGate, illustrating how automated or coordinated uploading can inflate author-level metrics. The network construction from actual uploaded papers and the contrast with a control network provide concrete, replicable support for detecting such activity, which could inform platform moderation and studies of citation integrity.
major comments (3)
- [Methods] Methods (account selection): The five accounts are labeled 'suspected' boosting-service providers, yet the specific criteria, metadata patterns, or evidence used to identify them are described only at a high level. This is load-bearing because the full dataset, motif analysis, and citation-attribution claims rest on these accounts being the source of the farming activity rather than ordinary uploads.
- [Validation] Validation section: The separate citation network used to establish that equal-references motifs are rare in legitimate work is not described as matched on paper genre, methodological overlap, or narrow subfields (e.g., data-descriptor or protocol papers where reference-list reuse is common). Without such controls, the motif's specificity as a signal of coordination versus legitimate reuse is not fully demonstrated.
- [Results] Results (disproportionate citation): The claim that papers in equal-references groups 'disproportionately cite a small set of authors' requires an explicit quantitative baseline (e.g., comparison against the degree distribution or a null model in the constructed network) to support the interpretation of coordinated boosting over independent scholarly practice.
minor comments (2)
- [Abstract] Abstract: The phrase 'a different citation network' should specify its approximate size, source, or construction method to give readers immediate context for the validation step.
- [Introduction] Terminology: Ensure 'equal references groups' is defined with precise operational criteria (exact string match on reference lists, or allowing minor formatting variations) at first use and applied consistently.
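The difference between exact string matching and matching that tolerates formatting variation can be made concrete. The sketch below is one possible normalization, not the paper's criterion: `normalize_reference` and `reference_list_key` are hypothetical helpers, and the choice to strip accents, case, and punctuation is an assumption.

```python
import re
import unicodedata

def normalize_reference(ref):
    """Reduce a raw reference string to lowercase alphanumeric tokens."""
    text = unicodedata.normalize("NFKD", ref)
    # Drop combining marks so accented letters compare equal to plain ones.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Collapse punctuation and whitespace runs into single spaces.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()

def reference_list_key(refs, exact=False):
    """Build a comparison key for a reference list.

    exact=True  -> ordered tuple of the raw strings (strict string match).
    exact=False -> unordered set of normalized strings (tolerant match).
    """
    if exact:
        return tuple(refs)
    return frozenset(normalize_reference(r) for r in refs)

a = ["Wren, J. D. (2022).  Detecting anomalous referencing patterns."]
b = ["wren j d 2022 detecting anomalous referencing patterns"]
print(reference_list_key(a) == reference_list_key(b))
print(reference_list_key(a, exact=True) == reference_list_key(b, exact=True))
```

This normalization does not repair words hyphenated across line breaks during PDF extraction, so a tolerant criterion may need further steps, and a looser criterion raises the risk of merging lists that are genuinely different.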
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below, along with planned revisions to address the concerns raised.
- Referee: [Methods] Methods (account selection): The five accounts are labeled 'suspected' boosting-service provider accounts, yet the specific criteria, metadata patterns, or evidence used to identify them are described only at a high level. This is load-bearing because the full dataset, motif analysis, and citation-attribution claims rest on these accounts being the source of the farming activity rather than ordinary uploads.
  Authors: We agree that additional detail on the account selection process would improve the transparency of the study. In the revised manuscript, we will expand the Methods section to provide more specific information on the metadata patterns and observable behaviors that led us to suspect these accounts of providing boosting services. We note that complete disclosure of all identification heuristics is balanced against the risk of enabling further gaming of the platform, but we will include sufficient detail to allow readers to understand the basis for our selection. This addresses the load-bearing nature of the identification while maintaining the integrity of the analysis.
  Revision: yes
- Referee: [Validation] Validation section: The separate citation network used to establish that equal-references motifs are rare in legitimate work is not described as matched on paper genre, methodological overlap, or narrow subfields (e.g., data-descriptor or protocol papers where reference-list reuse is common). Without such controls, the motif's specificity as a signal of coordination versus legitimate reuse is not fully demonstrated.
  Authors: The referee correctly identifies a limitation in our validation approach. The control network was selected as a broad sample of legitimate scientific work from a different platform or corpus, but it was not explicitly matched for genre or subfield. In the revision, we will add a dedicated paragraph in the Validation section discussing this choice, including why we believe the motif remains rare even across varied fields, and we will explore the possibility of adding a more targeted control set if feasible with available data. We maintain that the stark contrast observed supports the motif as a strong signal, but we will qualify the interpretation accordingly.
  Revision: partial
- Referee: [Results] Results (disproportionate citation): The claim that papers in equal-references groups 'disproportionately cite a small set of authors' requires an explicit quantitative baseline (e.g., comparison against the degree distribution or a null model in the constructed network) to support the interpretation of coordinated boosting over independent scholarly practice.
  Authors: We appreciate this suggestion for strengthening the quantitative support. In the revised Results section, we will include an explicit comparison of the citation patterns in equal-references groups against the overall degree distribution in the author-level citation network. Additionally, we will describe a simple null model (e.g., random citation assignment preserving the number of references) to demonstrate that the observed concentration on a small set of authors is statistically unlikely under independent practice. This will provide the requested baseline and bolster the interpretation of coordinated activity.
  Revision: yes
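A null model of the kind mentioned in this exchange, random citation assignment that preserves each paper's number of references, could look roughly like the sketch below. Everything here is an assumption for illustration: the uniform choice over candidate authors, the top-k share as the concentration statistic, and the names `top_k_share`, `null_top_k_shares`, and `empirical_p_value` are not from the paper. A more realistic null would weight candidates by their citation counts in a control network.

```python
import random
from collections import Counter

def top_k_share(ref_lists, k):
    """Fraction of all citations that go to the k most-cited authors."""
    counts = Counter(author for refs in ref_lists for author in refs)
    total = sum(counts.values())
    top = sum(c for _, c in counts.most_common(k))
    return top / total if total else 0.0

def null_top_k_shares(ref_lists, candidates, k, n_trials=1000, seed=0):
    """Redraw each paper's cited authors uniformly at random, keeping the
    number of references per paper fixed, and record the top-k share.
    Requires len(candidates) >= the longest reference list."""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_trials):
        shuffled = [rng.sample(candidates, len(refs)) for refs in ref_lists]
        shares.append(top_k_share(shuffled, k))
    return shares

def empirical_p_value(observed, null_values):
    """One-sided p-value with the usual +1 correction."""
    hits = sum(1 for v in null_values if v >= observed)
    return (hits + 1) / (len(null_values) + 1)

# Toy data: 20 papers that all cite the same three authors out of 50.
candidates = [f"a{i}" for i in range(50)]
ref_lists = [["a0", "a1", "a2"]] * 20
observed = top_k_share(ref_lists, k=3)
null = null_top_k_shares(ref_lists, candidates, k=3, n_trials=200)
p = empirical_p_value(observed, null)
print(observed, max(null), p)
```

With the +1 correction, the smallest attainable p-value is 1 / (n_trials + 1), so the number of trials bounds how strong a claim the test can support.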
Circularity Check
No circularity: direct network construction and motif counting from source data
The paper constructs paper- and author-level citation networks directly from the ~3000 uploaded papers and metadata, defines equal-references groups as clusters sharing identical reference lists, counts their occurrence, and compares motif frequency against a separate control citation network. No equations, fitted parameters, or derivations are present that reduce a claimed result to the inputs by construction. The control network serves as external validation rather than a self-referential step. No self-citations are invoked as load-bearing premises, and the analysis does not rename or smuggle in prior results via ansatz. This is a standard empirical network study whose central claims rest on observable counts rather than definitional equivalence.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The five accounts are suspected boosting-service provider accounts
- domain assumption: Equal reference lists indicate coordinated or automated boosting rather than independent scholarly practice
invented entities (1)
- equal references groups: no independent evidence