arxiv: 2605.11930 · v1 · submitted 2026-05-12 · 💻 cs.DL · cs.SI


Citation Cliques in Low Impact Journals

Panagiotis-Alexios Spanakis, Grigorios Alexandrou, Diomidis Spinellis


Pith reviewed 2026-05-13 04:09 UTC · model grok-4.3


keywords citation networks · low-impact journals · author cohesion · bibliometrics · citation cliques · reciprocity · Eigenfactor · citation economies


The pith

Authors in low-impact journals cite each other at 6.7 times the rate and with 4.7 times the reciprocity of matched authors in high-impact venues.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper matches authors from low- and high-impact journals on subject area and h5-index, then compares their citation patterns using Crossref data. It finds markedly denser and more reciprocal author-to-author citations within the low-impact group, including cliques that form closed citation economies. A reader should care because these patterns can inflate bibliometric indicators and create a segregated citation landscape between high- and low-visibility venues.

Core claim

Using subject-normalized Eigenfactor percentiles to label venues and a 10 percent sample of 9,431 matched author pairs, the study shows low-impact authors exhibit 6.7 times higher co-author citation rates and 4.7 times higher reciprocity than high-impact controls. A hybrid detection pipeline isolates 277 outliers with 93.5 percent low-impact purity that display an 11-fold clique-strength increase, revealing a Two Worlds segregation where low-impact venues operate as inward-looking citation networks rather than participating in open exchange.

What carries the argument

Author matching by subject area and h5-index, followed by aggregate comparison of citation cohesion metrics (co-author citation rates and reciprocity) and a subject-aware outlier detection pipeline that identifies cliques and their hub-and-spoke topologies.
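The two cohesion metrics named above can be illustrated on a toy citation graph. This is a minimal sketch under assumed definitions, since the paper's exact formulas are not given here: reciprocity is taken as the share of authors someone cites who cite them back, and the co-author citation rate as the share of outgoing citations that point to co-authors. The function names and the toy data are hypothetical.

```python
def reciprocity_rate(edges, author):
    """Share of authors cited by `author` who also cite `author` back.

    Assumed definition for illustration; `edges` is a list of
    (citing_author, cited_author) pairs. Self-citations are ignored.
    """
    cited = {t for s, t in edges if s == author and t != author}
    if not cited:
        return 0.0
    citing = {s for s, t in edges if t == author}
    return len(cited & citing) / len(cited)


def coauthor_citation_rate(citations, coauthors, author):
    """Share of `author`'s outgoing citations that point to co-authors.

    Assumed definition for illustration; `citations` maps
    (citing_author, cited_author) to a citation count.
    """
    out = {t: c for (s, t), c in citations.items() if s == author and t != author}
    total = sum(out.values())
    if total == 0:
        return 0.0
    own_circle = coauthors.get(author, set())
    return sum(c for t, c in out.items() if t in own_circle) / total


# Toy data, not taken from the paper.
citations = {("a", "b"): 3, ("b", "a"): 2, ("a", "c"): 1, ("c", "d"): 4}
coauthors = {"a": {"b"}, "b": {"a"}}
edges = list(citations)

print(reciprocity_rate(edges, "a"))                      # a cites b and c; only b cites back
print(coauthor_citation_rate(citations, coauthors, "a"))  # 3 of a's 4 citations go to co-author b
```

Comparing the mean of each metric between matched Case and Control authors would then give ratios of the kind the paper reports, although the paper's own aggregation may differ.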

If this is right

Low-impact venues sustain segregated citation economies that inflate their own bibliometric scores.
Cohesion, rather than one-way asymmetry, is the dominant driver of the observed Case-Control gap.
Outlier cliques display directed flows from peripheral authors toward central beneficiaries, not equal exchange.
The Two Worlds pattern (r = 0.71 on the tier mixing matrix) suggests that citation-based ranking systems systematically misrepresent influence across impact strata.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Citation-based evaluation of researchers who publish mainly in low-impact venues may need separate normalization to avoid over- or under-counting due to local reciprocity.
The same matching-and-cohesion approach could be applied to conference proceedings or preprint servers to test whether the pattern generalizes beyond journals.
If the closed economies persist over time, they could widen the visibility gap between high- and low-impact communities even when underlying research quality is comparable.

Load-bearing premise

Matching authors solely by subject area and h5-index fully removes confounding differences between the low- and high-impact groups, so that observed cohesion gaps can be attributed to venue impact level.
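To make the premise concrete, here is a minimal sketch of exact one-to-one matching on subject area and h5-index. It is not the authors' matching code; the record fields (`id`, `subject`, `h5`, `n_pubs`) and the greedy pairing are assumptions chosen for illustration.

```python
from collections import defaultdict


def match_pairs(cases, controls):
    """Pair each Case author with one unused Control sharing subject and h5.

    Greedy exact matching, a hypothetical stand-in for the paper's procedure.
    Covariates outside the key (here `n_pubs`) are not balanced.
    """
    pool = defaultdict(list)
    for c in controls:
        pool[(c["subject"], c["h5"])].append(c)
    pairs = []
    for a in cases:
        bucket = pool[(a["subject"], a["h5"])]
        if bucket:
            pairs.append((a, bucket.pop()))
    return pairs


# Toy records, not taken from the paper.
cases = [
    {"id": "c1", "subject": "cs", "h5": 4, "n_pubs": 60},
    {"id": "c2", "subject": "cs", "h5": 4, "n_pubs": 12},
    {"id": "c3", "subject": "bio", "h5": 2, "n_pubs": 9},
]
controls = [
    {"id": "k1", "subject": "cs", "h5": 4, "n_pubs": 15},
    {"id": "k2", "subject": "bio", "h5": 3, "n_pubs": 8},
]

pairs = match_pairs(cases, controls)
for case, control in pairs:
    # Same subject and h5, yet total output can differ widely.
    print(case["id"], control["id"], case["n_pubs"], control["n_pubs"])
```

In the toy data the single matched pair agrees on subject and h5 but differs fourfold in publication count, which is the kind of residual imbalance the premise has to rule out.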

What would settle it

A re-analysis that adds further author-level controls (such as career length, institutional prestige, or total publication count) and finds the cohesion and reciprocity differences disappear or reverse.

Figures

Figures reproduced from arXiv: 2605.11930 by Diomidis Spinellis, Grigorios Alexandrou, Panagiotis-Alexios Spanakis.

**Figure 2.** Co-author citation gap by subject field. Forest plot of the mean paired difference $\Delta = \bar{r}_{\mathrm{Case}} - \bar{r}_{\mathrm{Control}}$ in co-author citation rate, with 95% confidence intervals. Marker • = field estimate; ♦ = overall. $\Delta$ labels report rounded effect sizes. The dashed vertical line marks zero (no difference); the pale red shading covers the positive region where Case authors exceed Controls.

**Figure 3.** Cohesion and concentration metrics rank as the most discriminative features for tier classification, while authority metrics such as eigenvector centrality play a secondary role. Horizontal axis: mean decrease in impurity. Features plotted: Local Clustering, Incoming Entropy, Incoming HHI, Norm. Triangles, Clique Strength, K-Core Number, Reciprocity Rate, Eigenvector Centr., Co-author Cit. Rate, Self-Citatio…
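Two of the concentration features named in Figure 3, Incoming HHI and Incoming Entropy, have standard textbook forms that can be sketched briefly. The code below assumes they are the Herfindahl-Hirschman index and the Shannon entropy of an author's incoming-citation shares; the paper's exact normalisation is not given here, and the function names are hypothetical.

```python
import math


def incoming_shares(incoming_counts):
    """Convert {citing_author: citation_count} into a list of shares."""
    total = sum(incoming_counts.values())
    if total == 0:
        return []
    return [c / total for c in incoming_counts.values() if c > 0]


def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared shares (1.0 = one source)."""
    return sum(p * p for p in shares)


def entropy_bits(shares):
    """Shannon entropy in bits (0.0 = one source, higher = more diverse)."""
    return -sum(p * math.log2(p) for p in shares)


# Toy counts, not taken from the paper.
concentrated = incoming_shares({"x": 8, "y": 1, "z": 1})
balanced = incoming_shares({"x": 5, "y": 5})

print(hhi(concentrated), entropy_bits(concentrated))
print(hhi(balanced), entropy_bits(balanced))
```

An author whose citations come mostly from one source scores high on HHI and low on entropy, which is the signature one would expect inside a tight clique.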

**Figure 4.** Subject-Specific Effect Sizes (Cliff’s δ). Each cell reports the effect size for a given metric within a subject area. Positive values (white–pale red) indicate that Cases exhibit higher values than Controls. Negative values (blue) indicate that Controls exhibit higher values than Cases. Co-author Citation Rate shows near-zero or slightly positive values across all subjects, while velocity and burstiness a…
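For readers unfamiliar with the effect size used in Figure 4, Cliff's δ is the probability that a value from one group exceeds a value from the other, minus the reverse probability. The sketch below uses the standard pairwise definition; the function name and the sample values are illustrative only.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (n_x * n_y).

    Ranges from -1 (every y exceeds every x) to +1 (every x exceeds every y).
    This is the O(n_x * n_y) form, adequate for small samples.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Toy metric values, not taken from the paper.
case_values = [3, 4, 5]
control_values = [1, 2, 3]

print(cliffs_delta(case_values, control_values))   # positive: Cases tend to be higher
print(cliffs_delta(control_values, case_values))   # sign flips when groups are swapped
```

With Cases passed first, a positive value matches the figure's convention that Cases exhibit higher values than Controls.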

**Figure 5.** Tier separability: LDA projection. Kernel density estimates of the scores produced by projecting each author’s standardised feature vector onto the single Linear Discriminant Analysis (LDA) axis that maximally separates Case from Control authors. Dotted vertical lines mark within-tier medians. Negative scores correspond to the cohesion-driven regime (higher co-author citation, clustering, and reciprocity),…

**Figure 6.** Outlier behavioural fingerprint. Radar chart of outlier fold-change ratios (log-scaled radial axis) relative to non-outlier authors across six citation-behaviour metrics. Values on each spoke give the outlier-to-normal mean ratio; concentric rings mark 1×, 10×, and 100× baselines. The dashed inner ring corresponds to 1× (no difference).

**Figure 7.** Largest outlier citation syndicate (n = 23). Force-directed layout of the internal directed citation network. Node size is proportional to betweenness centrality; salmon = hub (highest betweenness); pink = net giver (out-degree > in-degree); cyan = net receiver. Arrows indicate the direction of citation flow; edge width scales with the number of citations between two authors. The inset box reports node cou…

**Figure 8.** Citation mixing matrix (r = 0.71, Q = 0.97; strong homophily). Row-normalised proportion of citations directed within and across tiers. Cell values show the conditional probability that an author in the row tier cites an author in the column tier. The Brewer RdBu colormap is centred at 0.5. Below: diagonal average (84.0%) summarises overall within-tier preference.
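The quantities in Figure 8 can be reproduced in form on a toy two-tier matrix. This sketch row-normalises raw citation counts, averages the diagonal, and computes Newman's discrete assortativity coefficient, r = (Σ e_ii − Σ a_i b_i) / (1 − Σ a_i b_i). Treating the paper's r as this coefficient is an assumption; the counts and function names are hypothetical.

```python
def row_normalise(counts):
    """P(cited tier = column | citing tier = row) from raw citation counts."""
    return [[c / sum(row) for c in row] for row in counts]


def diagonal_average(matrix):
    """Mean of the diagonal cells, i.e. the average within-tier share."""
    n = len(matrix)
    return sum(matrix[i][i] for i in range(n)) / n


def assortativity(counts):
    """Newman's assortativity coefficient for a categorical mixing matrix."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    e = [[c / total for c in row] for row in counts]        # joint edge fractions
    trace = sum(e[i][i] for i in range(n))
    a = [sum(e[i]) for i in range(n)]                        # share of edges leaving tier i
    b = [sum(e[i][j] for i in range(n)) for j in range(n)]   # share of edges entering tier j
    expected = sum(a[i] * b[i] for i in range(n))            # within-tier share under random mixing
    return (trace - expected) / (1 - expected)


# Toy counts (rows: citing tier, columns: cited tier), not taken from the paper.
toy_counts = [[90, 10],
              [30, 70]]

mixing = row_normalise(toy_counts)
print(mixing)
print(diagonal_average(mixing))
print(assortativity(toy_counts))
```

Note that the diagonal average is computed on the row-normalised matrix while r uses the joint fractions, so the two summaries can diverge when tiers differ in citation volume.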

Original abstract

This exploratory study examines how low-impact journals, defined through subject-normalized Eigenfactor percentiles, are associated with denser and more reciprocating patterns of author-to-author citations. Using Crossref records, we assign journals to broad subject areas, compute subject-specific Eigenfactor scores, propagate venue quality to works and authors, match authors in low- (Case) versus high-influence (Control) venues by subject and h5, and analyze citation edges for cohesion and anomalies. Across a 10% sample of 9,431 matched pairs, authors in low-impact venues exhibit significantly higher cohesion: 6.7x higher co-author citation rates and 4.7x higher reciprocity in the aggregate Case-Control comparison. A subject-aware hybrid detection pipeline flags 277 outliers with 93.5% Case purity; these outliers display an 11x clique-strength lift relative to non-outliers, revealing a stark "Two Worlds" segregation (r = 0.71) where low-impact venues operate as closed citation economies. The largest detected component (n = 23) displays a hub-and-spoke topology in which peripheral "Sycophants" funnel citations to central "Beneficiaries" through coordinated bursts, confirming a directed flow imbalance rather than reciprocal exchange among equals. Overall, cohesion, rather than broad asymmetry, accounts for the main Case-Control differences, suggesting that low-impact venues foster segregated, inward-looking citation economies that distort bibliometric indicators.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper reports 6.7x higher co-author citation rates and 4.7x reciprocity in low-impact venues after matching, but the h5-index match leaves volume and team-size differences unaddressed.


The main result is that authors tied to low-impact journals show markedly denser internal citation patterns than matched controls in high-impact ones. Across the 9,431 matched pairs, their co-author citation rate is 6.7 times higher and their reciprocity 4.7 times higher, and the outlier pipeline isolates a small set of cliques that are almost all low-impact. The hub-and-spoke structure in the largest component (n = 23) gives a concrete illustration of how citations can flow inward rather than spread out.

The scale and the subject-aware matching are the parts that feel new relative to earlier citation-network studies. The data source is Crossref records plus Eigenfactor scores, which is transparent and reproducible in principle, and the purity rate on the flagged outliers (93.5 percent Case) is a clean number to evaluate. The work is therefore useful for anyone who uses journal-level metrics in hiring or funding decisions and wants to see one possible source of distortion.

The matching step is the clearest soft spot. Holding subject and h5-index fixed does not equalize total output volume or average collaboration size, so the low-impact group could generate more internal citation opportunities through higher productivity alone. Without checks on those variables, the reported multipliers are harder to attribute cleanly to venue type. The abstract also omits error bars and the exact statistical procedure, which makes it difficult to judge precision. The circularity risk is moderate, though their use of external scores addresses part of it.

Readers who care about bibliometric robustness will find the comparison worth examining even if the causal claim needs tightening. I would send it to peer review with targeted requests for volume-balance diagnostics and clearer reporting of the detection pipeline.

Referee Report

2 major / 1 minor

Summary. This exploratory study uses Crossref data to examine citation cohesion in low-impact journals (defined via subject-normalized Eigenfactor percentiles). Authors from low-impact (Case) and high-impact (Control) venues are matched by subject area and h5-index; across a 10% sample of 9,431 pairs, the analysis reports 6.7x higher co-author citation rates and 4.7x higher reciprocity in the Case group. A subject-aware hybrid detection pipeline identifies 277 outliers (93.5% Case purity) with 11x higher clique strength, revealing a 'Two Worlds' segregation (r=0.71) and a hub-and-spoke topology in the largest component (n=23) involving 'Sycophants' and 'Beneficiaries'. The central claim is that low-impact venues operate as closed, inward-looking citation economies that distort bibliometric indicators.

Significance. If the quantitative differences hold after addressing confounding and validation gaps, the work would demonstrate that venue impact level correlates with citation segregation at scale, with implications for the reliability of metrics like Eigenfactor. Strengths include the large matched sample (9,431 pairs), concrete multipliers, and the hybrid outlier pipeline that achieves high reported purity; these provide falsifiable, data-driven claims rather than purely theoretical assertions.

major comments (2)

[Abstract] Abstract (author matching procedure): Matching solely by subject area and h5-index does not control for publication volume or mean collaboration/team size. Authors with identical h5-index can differ substantially in total output, creating more opportunities for internal citations within the Case group independent of venue segregation. This is load-bearing for the central claim, as the 6.7x co-author citation rate and 4.7x reciprocity differences cannot be isolated to venue impact level without balancing these factors.
[Abstract] Abstract (results and pipeline description): The reported multipliers (6.7x, 4.7x), 93.5% Case purity, and r=0.71 correlation lack error bars, confidence intervals, exact statistical tests, or p-values. The hybrid detection pipeline's validation (false-positive rates, ground-truth comparison) is not described, leaving the outlier identification and 'Two Worlds' segregation claim only moderately supported despite the large sample.

minor comments (1)

[Abstract] The terms 'Sycophants' and 'Beneficiaries' are introduced in the abstract without operational definitions or explicit criteria for assignment in the hub-and-spoke component; this reduces clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments on our exploratory study. We address each major comment in detail below, providing our responses and indicating the revisions we will make to the manuscript.

Point-by-point responses

Referee: [Abstract] Abstract (author matching procedure): Matching solely by subject area and h5-index does not control for publication volume or mean collaboration/team size. Authors with identical h5-index can differ substantially in total output, creating more opportunities for internal citations within the Case group independent of venue segregation. This is load-bearing for the central claim, as the 6.7x co-author citation rate and 4.7x reciprocity differences cannot be isolated to venue impact level without balancing these factors.

Authors: We recognize that additional controls for publication volume and team size could further isolate the effect of venue impact. Although h5-index provides a partial proxy for author productivity, we agree this is a valid concern for the central claim. In the revised version, we will augment the author matching procedure to also match on total publication count and average collaboration size. We will then recompute the co-author citation rates and reciprocity metrics under this stricter matching and report any changes to the 6.7x and 4.7x multipliers.
Revision: yes
Referee: [Abstract] Abstract (results and pipeline description): The reported multipliers (6.7x, 4.7x), 93.5% Case purity, and r=0.71 correlation lack error bars, confidence intervals, exact statistical tests, or p-values. The hybrid detection pipeline's validation (false-positive rates, ground-truth comparison) is not described, leaving the outlier identification and 'Two Worlds' segregation claim only moderately supported despite the large sample.

Authors: We agree that the abstract and associated claims would benefit from explicit statistical support and pipeline validation details. In the revised manuscript, we will include bootstrap confidence intervals for the reported multipliers and the correlation coefficient, along with the results of permutation tests for significance. Additionally, we will expand the methods section to describe the validation of the hybrid detection pipeline, including false-positive rates from ground-truth comparisons and the cross-validation procedure used to obtain the 93.5% purity figure. The abstract will be updated to reference these statistical measures.
Revision: yes
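The bootstrap intervals and permutation tests promised in this response could take many forms. One common choice for matched pairs is sketched below: a percentile bootstrap for the mean paired difference and a sign-flip permutation test. It is not the authors' procedure; the function names, resample counts, and the toy differences are assumptions.

```python
import random


def bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of paired differences."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot))
    low = means[int((alpha / 2) * n_boot)]
    high = means[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


def sign_flip_p_value(diffs, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation test for a zero mean paired difference.

    Under the null, each Case-Control difference is equally likely to have
    either sign, so signs are flipped at random and the means compared.
    """
    rng = random.Random(seed)
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_perm):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy Case-minus-Control differences in a cohesion metric, not from the paper.
diffs = [0.3, 0.5, 0.2, 0.4, 0.6, 0.1, 0.35, 0.45]

ci_low, ci_high = bootstrap_ci(diffs)
p_value = sign_flip_p_value(diffs)
print(ci_low, ci_high, p_value)
```

Resampling pairs rather than individual authors preserves the matched design, which is why both functions operate on the per-pair differences.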

Circularity Check

0 steps flagged

No significant circularity in the derivation chain


The paper performs an observational empirical analysis on external Crossref citation data, using standard subject-normalized Eigenfactor percentiles to classify low- versus high-impact venues and then computing separate cohesion metrics (co-author citation rates, reciprocity) on the same graph. No load-bearing step reduces a claimed result to a tautology, fitted parameter, or self-citation chain by construction. The matching by subject and h5-index, outlier detection, and aggregate Case-Control comparisons are direct data-driven procedures without self-definitional loops or renamed known results. The analysis is self-contained against external benchmarks and does not invoke any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The central claims rest on the assumption that Eigenfactor-based classification and h5/subject matching isolate venue effects, plus post-hoc labeling of network roles; no independent verification of these steps is provided in the abstract.

free parameters (2)

low-impact threshold
Subject-normalized Eigenfactor percentiles used to assign journals to Case group
sample fraction
10% sample of 9,431 matched pairs selected for analysis

axioms (2)

domain assumption: Eigenfactor scores and h5-index provide valid proxies for journal influence and author productivity respectively
Used to define Case/Control groups and perform matching
ad hoc to paper: The hybrid detection pipeline accurately identifies citation cliques without excessive false positives
Invoked to flag 277 outliers with reported 93.5% Case purity

invented entity (1)

Sycophants and Beneficiaries (no independent evidence)
purpose: Descriptive labels for peripheral and central nodes in the largest detected component
Assigned to describe hub-and-spoke topology; no independent evidence provided


