pith. machine review for the scientific record. sign in

arxiv: 2604.18565 · v1 · submitted 2026-04-20 · 💻 cs.SI

Recognition: unknown

Detectability of minority communities in networks

Jiaze Li, Leto Peel

Pith reviewed 2026-05-10 03:00 UTC · model grok-4.3

classification 💻 cs.SI
keywords community detectionstochastic block modelminority communitiesphase transitionsdetectabilitybelief propagationspectral clustering
0
0 comments X

The pith

Minority communities in networks become detectable in three successive phases as signal strength increases in the stochastic block model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how small minority communities can be detected in networks that follow the stochastic block model with uneven group sizes. It identifies three phases: an early stage where the broad community layout appears but small groups blend into larger ones, a middle stage where the small groups stand apart from the rest but not from each other, and a late stage where each small group is resolved on its own. These stages line up with specific thresholds set by the connection patterns in the model. This matters because many real networks contain such small groups that shape overall organization, yet common detection tools often overlook them until the signal is quite strong.

Core claim

In the stochastic block model with heterogeneous community sizes, community detection proceeds through three phases: the detectable phase where overall structure is recovered but minority communities merge with majority ones; the distinguishable phase where minority communities form a coherent group separate from the majority but remain unresolved internally; and the resolvable phase where each minority community is fully distinguishable. These phases correspond to phase transitions at the Kesten-Stigum threshold and two additional thresholds determined by the eigenvalue structure of the signal matrix. Spectral clustering with the Bethe Hessian exhibits significantly weaker detection of the细

What carries the argument

The eigenvalue structure of the signal matrix, which fixes the locations of the two extra phase transitions that separate the distinguishable and resolvable regimes for minority communities.

Load-bearing premise

The network is generated exactly according to a stochastic block model with heterogeneous community sizes whose connection probabilities permit explicit derivation of the signal-matrix eigenvalues and phase boundaries.

What would settle it

A network drawn from the model in which the observed boundaries for distinguishing and resolving minority communities do not match the thresholds predicted from the eigenvalues of its signal matrix, or in which Bethe-Hessian spectral clustering performs as well as belief propagation on the minority groups.

Figures

Figures reproduced from arXiv: 2604.18565 by Jiaze Li, Leto Peel.

Figure 1
Figure 1. Figure 1: FIG. 1 [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: schematically illustrates the three phases for a network with qs = 2 minority and qb = 2 majority communities: panel (a) shows the detectable phase (q = 2), where minorities are indistinguishable from ma￾jorities; panel (b) shows the distinguishable phase (q = 3), where minorities form a coherent group sep￾arate from majorities but remain unresolved internally; panel (c) shows the resolvable phase (q = 4),… view at source ↗
Figure 3
Figure 3. Figure 3: FIG. 3 [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: FIG. 5 [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: FIG. 6 [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: FIG. 7 [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗
Figure 9
Figure 9. Figure 9: FIG. 9 [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Visual inspection clearly suggests that f min q reaches a minimum at q = 4 and then levels off, indi￾cating q ∗ = 4. However, it is challenging to determine an appropriate precision for comparing f min q values. Mi￾nor differences in f min q beyond the q ∗ = 4 can affect the selection of the final number of communities. In our ex￾perimental setup, after reviewing free energy plots across several (ρ, δ) po… view at source ↗
Figure 11
Figure 11. Figure 11: FIG. 11 [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: FIG. 12 [PITH_FULL_IMAGE:figures/full_fig_p019_12.png] view at source ↗
Figure 14
Figure 14. Figure 14: FIG. 14 [PITH_FULL_IMAGE:figures/full_fig_p020_14.png] view at source ↗
Figure 16
Figure 16. Figure 16: FIG. 16 [PITH_FULL_IMAGE:figures/full_fig_p021_16.png] view at source ↗
read the original abstract

Community structure is prevalent in real-world networks, with empirical studies revealing heterogeneous distributions where a few dominant majority communities coexist with many smaller groups. These small-scale groups, which we term minority communities, are critical for understanding network organization but pose significant challenges for detection. Here, we investigate the detectability of minority communities from a theoretical perspective using the Stochastic Block Model. We identify three distinct phases of community detection: the detectable phase, where overall community structure is recoverable but minority communities are merged into majority groups; the distinguishable phase, where minority communities form a coherent group separate from the majority but remain unresolved internally; and the resolvable phase, where each minority community is fully distinguishable. These phases correspond to phase transitions at the Kesten-Stigum threshold and two additional thresholds determined by the eigenvalue structure of the signal matrix, which we derive explicitly. Furthermore, we demonstrate that spectral clustering with the Bethe Hessian exhibits significantly weaker detection performance for minority communities compared to belief propagation, revealing a specific limitation of spectral methods in identifying fine-grained community structure despite their capability to detect macroscopic structures down to the theoretical limit.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript analyzes the detectability of minority communities in the stochastic block model with heterogeneous community sizes. It identifies three phases of detection: the detectable phase where overall structure is recoverable but minorities are merged, the distinguishable phase where minorities form a separate group but are not resolved internally, and the resolvable phase where each is fully distinguishable. These phases are linked to the Kesten-Stigum threshold and two additional thresholds derived from the eigenvalue structure of the signal matrix. The paper further demonstrates that spectral clustering using the Bethe Hessian has weaker performance for minority communities compared to belief propagation.

Significance. If the central claims hold, the work offers a valuable theoretical extension of community detection limits to heterogeneous settings, providing explicit, parameter-dependent thresholds that can be tested against simulations. The explicit derivation of the phase boundaries from the signal matrix's characteristic equation is a strength, enabling clear predictions without data fitting. This highlights a specific limitation of spectral methods in fine-grained detection, which could inform algorithm choice in applications with imbalanced communities.

minor comments (3)
  1. Abstract: the signal matrix should be briefly defined or cross-referenced to the section introducing its construction, as the phase thresholds depend explicitly on its eigenvalues.
  2. Numerical experiments: report quantitative metrics such as normalized mutual information or overlap for the Bethe-Hessian vs. belief propagation comparison, along with standard deviations across realizations, to substantiate the claim of significantly weaker performance for minority communities.
  3. Phase definitions: provide precise criteria (e.g., in terms of eigenvalue crossings or overlap values) that distinguish the detectable, distinguishable, and resolvable phases in both theory and simulations.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive evaluation of our manuscript and for recommending minor revision. The referee's summary accurately reflects our central results on the three phases of detectability (detectable, distinguishable, and resolvable) for minority communities in the stochastic block model with heterogeneous sizes, as well as the connection to the Kesten-Stigum threshold and the additional thresholds arising from the eigenvalue structure of the signal matrix. We are pleased that the explicit, parameter-dependent nature of these boundaries and the observed limitations of the Bethe Hessian relative to belief propagation are recognized as strengths.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper constructs the signal matrix from the heterogeneous SBM parameters (community-size vector and affinity matrix), derives its eigenvalues via the characteristic equation, and obtains the phase boundaries (Kesten-Stigum plus two additional thresholds) as explicit functions of those parameters. This algebraic derivation is self-contained within the standard mean-field sparse-regime analysis of the SBM and does not reduce any reported threshold or phase to a fitted quantity, a self-referential definition, or a load-bearing self-citation. The numerical performance comparison between Bethe-Hessian spectral clustering and belief propagation is likewise generated directly from the same model without additional tuning or circular reduction. The analysis therefore qualifies as an independent first-principles result.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The work rests on the standard stochastic block model as the generative process and on linear-algebraic properties of its expected adjacency matrix; no new entities are postulated.

free parameters (1)
  • minority community sizes and intra/inter-connection probabilities
    These SBM parameters control the location of the phase transitions but are treated as given inputs rather than fitted outputs in the abstract.
axioms (1)
  • domain assumption The stochastic block model is an appropriate generative model for networks containing both majority and minority communities
    All phase transitions and algorithm comparisons are derived inside this model.

pith-pipeline@v0.9.0 · 5480 in / 1388 out tokens · 51820 ms · 2026-05-10T03:00:14.876134+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

48 extracted references · 5 canonical work pages · 1 internal anchor

  1. [1]

    Del Ferraro, A

    G. Del Ferraro, A. Moreno, B. Min, F. Morone, ´U. P´ erez- Ram´ ırez, L. P´ erez-Cervera, L. C. Parra, A. Holodny, S. Canals, and H. A. Makse, Finding influential nodes for integration in brain networks using optimal percola- tion theory, Nature communications9, 2274 (2018)

  2. [2]

    Fortunato, Community detection in graphs, Physics reports486, 75 (2010)

    S. Fortunato, Community detection in graphs, Physics reports486, 75 (2010)

  3. [3]

    Nicolini and A

    C. Nicolini and A. Bifone, Modular structure of brain functional networks: breaking the resolution limit by sur- prise, Scientific reports6, 19250 (2016)

  4. [4]

    Fortunato and M

    S. Fortunato and M. E. Newman, 20 years of network community detection, Nature Physics18, 848 (2022)

  5. [5]

    M. E. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical review E69, 026113 (2004)

  6. [6]

    Karrer and M

    B. Karrer and M. E. Newman, Stochastic blockmodels and community structure in networks, Physical review E 83, 016107 (2011)

  7. [7]

    Saade, F

    A. Saade, F. Krzakala, and L. Zdeborov´ a, Spectral clus- tering of graphs with the bethe hessian, Advances in Neu- ral Information Processing Systems27(2014)

  8. [8]

    Dall’Amico, R

    L. Dall’Amico, R. Couillet, and N. Tremblay, Revisit- ing the bethe-hessian: improved community detection in sparse heterogeneous graphs, Advances in neural infor- mation processing systems32(2019)

  9. [9]

    Saade, Spectral inference methods on sparse graphs: theory and applications, arXiv preprint arXiv:1610.04337 (2016)

    A. Saade, Spectral inference methods on sparse graphs: theory and applications, arXiv preprint arXiv:1610.04337 (2016)

  10. [10]

    Leskovec, K

    J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Ma- honey, Community structure in large networks: Natural cluster sizes and the absence of large well-defined clus- ters, Internet Mathematics6, 29 (2009)

  11. [11]

    M. E. Newman, Fast algorithm for detecting community structure in networks, Physical Review E—Statistical, 12 Nonlinear, and Soft Matter Physics69, 066133 (2004)

  12. [12]

    P. M. Gleiser and L. Danon, Community structure in jazz, Advances in complex systems6, 565 (2003)

  13. [13]

    Guimera, L

    R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt, and A. Arenas, Self-similar community structure in a net- work of human interactions, Physical review E68, 065103 (2003)

  14. [14]

    P. W. Holland, K. B. Laskey, and S. Leinhardt, Stochastic blockmodels: First steps, Social networks5, 109 (1983)

  15. [15]

    C. E. Shannon, A mathematical theory of communica- tion, The Bell system technical journal27, 379 (1948)

  16. [16]

    Abbe and C

    E. Abbe and C. Sandon, Achieving the ks threshold in the general stochastic block model with linearized acyclic belief propagation, Advances in Neural Information Pro- cessing Systems29(2016)

  17. [17]

    Kesten and B

    H. Kesten and B. P. Stigum, Additional limit theorems for indecomposable multidimensional galton-watson pro- cesses, The Annals of Mathematical Statistics37, 1463 (1966)

  18. [18]

    Decelle, F

    A. Decelle, F. Krzakala, C. Moore, and L. Zdeborov´ a, Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications, Phys- ical Review E84, 066106 (2011)

  19. [19]

    Abbe, Community detection and stochastic block models: recent developments, Journal of Machine Learn- ing Research18, 1 (2018)

    E. Abbe, Community detection and stochastic block models: recent developments, Journal of Machine Learn- ing Research18, 1 (2018)

  20. [20]

    Mooreet al., The computer science and physics of community detection: Landscapes, phase transitions, and hardness, Bulletin of EATCS1(2017)

    C. Mooreet al., The computer science and physics of community detection: Landscapes, phase transitions, and hardness, Bulletin of EATCS1(2017)

  21. [21]

    Peel and M

    L. Peel and M. T. Schaub, Detectability of hierarchical communities in networks, Physical Review E110, 034306 (2024)

  22. [22]

    H. Wu, L. Fu, H. Long, G. Meng, X. Gan, Y. Wu, H. Zhang, and X. Wang, Unraveling the detectability of stochastic block model with overlapping communities, IEEE Transactions on Network Science and Engineering 8, 1443 (2021)

  23. [23]

    Wilinski, P

    M. Wilinski, P. Mazzarisi, D. Tantari, and F. Lillo, De- tectability of macroscopic structures in directed asym- metric stochastic block model, Physical Review E99, 042310 (2019)

  24. [24]

    Ghasemian, P

    A. Ghasemian, P. Zhang, A. Clauset, C. Moore, and L. Peel, Detectability thresholds and optimal algorithms for community structure in dynamic networks, Physical Review X6, 031005 (2016)

  25. [25]

    M. C. Angelini, F. Caltagirone, F. Krzakala, and L. Zde- borov´ a, Spectral detection on sparse hypergraphs, in 2015 53rd Annual Allerton Conference on Communica- tion, Control, and Computing (Allerton)(IEEE, 2015) pp. 66–73

  26. [26]

    Chodrow, N

    P. Chodrow, N. Eikmeier, and J. Haddock, Nonback- tracking spectral clustering of nonuniform hypergraphs, SIAM Journal on Mathematics of Data Science5, 251 (2023)

  27. [27]

    Ruggeri, A

    N. Ruggeri, A. Lonardi, and C. De Bacco, Message- passing on hypergraphs: detectability, phase transitions and higher-order information, Journal of Statistical Me- chanics: Theory and Experiment2024, 043403 (2024)

  28. [28]

    J. Li, M. T. Schaub, and L. Peel, Higher order trade- offs in hypergraph community detection, arXiv preprint arXiv:2601.10502 (2026)

  29. [29]

    B. Chin, E. Mossel, Y. Sohn, and A. S. Wein, Stochastic block models with many communities and the kesten– stigum bound, arXiv preprint arXiv:2503.03047 (2025)

  30. [30]

    Young, P

    J.-G. Young, P. Desrosiers, L. H´ ebert-Dufresne, E. Lau- rence, and L. J. Dub´ e, Finite-size analysis of the de- tectability limit of the stochastic block model, Physical Review E95, 062304 (2017)

  31. [31]

    Massouli´ e, Community detection thresholds and the weak ramanujan property, inProceedings of the forty- sixth annual ACM symposium on Theory of computing (2014) pp

    L. Massouli´ e, Community detection thresholds and the weak ramanujan property, inProceedings of the forty- sixth annual ACM symposium on Theory of computing (2014) pp. 694–703

  32. [32]

    Mossel, J

    E. Mossel, J. Neeman, and A. Sly, Belief propagation, ro- bust reconstruction and optimal recovery of block mod- els, inConference on Learning Theory(PMLR, 2014) pp. 356–370

  33. [33]

    Fortunato and M

    S. Fortunato and M. Barthelemy, Resolution limit in community detection, Proceedings of the national academy of sciences104, 36 (2007)

  34. [34]

    Krzakala, C

    F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborov´ a, and P. Zhang, Spectral redemption in clustering sparse networks, Proceedings of the National Academy of Sciences110, 20935 (2013)

  35. [35]

    Aicher, A

    C. Aicher, A. Z. Jacobs, and A. Clauset, Learning latent block structure in weighted networks, Journal of Complex Networks3, 221–248 (2014)

  36. [36]

    Chaudhuri, F

    K. Chaudhuri, F. Chung, and A. Tsiatas, Spectral clus- tering of graphs with general degrees in the extended planted partition model, inConference on Learning Theory(JMLR Workshop and Conference Proceedings,

  37. [37]

    Tsourakakis, Streaming graph partitioning in the planted partition model, inProceedings of the 2015 ACM on Conference on Online Social Networks(2015) pp

    C. Tsourakakis, Streaming graph partitioning in the planted partition model, inProceedings of the 2015 ACM on Conference on Online Social Networks(2015) pp. 27– 35

  38. [38]

    Condon and R

    A. Condon and R. M. Karp, Algorithms for graph parti- tioning on the planted partition model, inInternational Workshop on Randomization and Approximation Tech- niques in Computer Science(Springer, 1999) pp. 221– 232

  39. [39]

    Stephan and Y

    L. Stephan and Y. Zhu, Community detection with the bethe-hessian, arXiv preprint arXiv:2411.02835 (2024)

  40. [40]

    Bordenave, M

    C. Bordenave, M. Lelarge, and L. Massouli´ e, Non- backtracking spectrum of random graphs: community detection and non-regular ramanujan graphs, in2015 IEEE 56th Annual Symposium on Foundations of Com- puter Science(IEEE, 2015) pp. 1347–1357

  41. [41]

    Abbe and C

    E. Abbe and C. Sandon, Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic bp, and the information-computation gap, arXiv preprint arXiv:1512.09080 (2015)

  42. [42]

    J. R. Silvester, Determinants of block matrices, The Mathematical Gazette84, 460 (2000)

  43. [43]

    N. X. Vinh, J. Epps, and J. Bailey, Information theo- retic measures for clusterings comparison: is a correction for chance necessary?, inProceedings of the 26th annual international conference on machine learning(2009) pp. 1073–1080

  44. [44]

    Zdeborov´ a and F

    L. Zdeborov´ a and F. Krzakala, Statistical physics of in- ference: Thresholds and algorithms, Advances in Physics 65, 453 (2016)

  45. [45]

    A. P. Dempster, N. M. Laird, and D. B. Rubin, Maxi- mum likelihood from incomplete data via the em algo- rithm, Journal of the royal statistical society: series B (methodological)39, 1 (1977)

  46. [46]

    T. P. Peixoto, Parsimonious module inference in large networks, Physical review letters110, 148701 (2013). 13

  47. [47]

    Zhang, C

    P. Zhang, C. Moore, and M. Newman, Community de- tection in networks with unequal groups, Physical review E93, 012303 (2016)

  48. [48]

    ρ qs pin −λ− − ρ qs δ+λ ρ qs δ−λ ρ qs pout(qs −1) − − ρ qs δ+λ 1−ρ qb δ−λ 1−ρ qb poutqb # = ρ qs δ−λ qs−1 · 1−ρ qb δ−λ qb ·

    K. B. Petersen, M. S. Pedersen,et al., The matrix cook- book, Technical University of Denmark7, 510 (2008). Appendix A: Numerical Considerations for F ree Energy-Based Order Selection The order selection algorithm presented in Section 2 relies on minimizing the free energyf q to determine the number of communities. However, statistical fluctuations and nu...