Identifying vital nodes based on reverse greedy method

Simiao Liu; Tao Ren; Tao Zhou; Yanjie Xu; Yi Qi; Yixin Zhang; Zhe Li

arxiv: 1907.01388 · v1 · pith:7SOS7HJInew · submitted 2019-07-02 · ⚛️ physics.soc-ph · cs.SI· physics.data-an

Identifying vital nodes based on reverse greedy method

Tao Ren , Zhe Li , Yi Qi , Yixin Zhang , Simiao Liu , Yanjie Xu , Tao Zhou This is my paper

Pith reviewed 2026-05-25 10:29 UTC · model grok-4.3

classification ⚛️ physics.soc-ph cs.SIphysics.data-an

keywords vital nodesnetwork connectivitygreedy algorithmreverse greedynode rankingcomplex networksinduced subgraphconnectivity maintenance

0 comments

The pith

A reverse greedy method ranks nodes by their importance to network connectivity by selecting the least important ones first.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a reverse greedy method for identifying vital nodes that maintain network connectivity. The method preferentially selects the least important nodes such that the size of the largest connected component in the induced subgraph is minimized. Nodes selected later in this process are considered more vital. Empirical tests on ten real networks demonstrate that this approach outperforms several state-of-the-art methods. Identifying these nodes is key to understanding and controlling the structure of complex networks.

Core claim

The reverse greedy method identifies vital nodes by iteratively choosing the least important nodes to make the size of the largest component in the corresponding induced subgraph as small as possible, with nodes chosen later being more important in maintaining the connectivity.

What carries the argument

The reverse greedy method that minimizes the largest connected component size in the induced subgraph by preferring least important nodes.

Load-bearing premise

That minimizing the size of the largest connected component when preferentially including least-important nodes produces a ranking that correctly identifies nodes vital for overall network connectivity.

What would settle it

Observing that on additional networks or controlled examples the reverse greedy ranking does not correlate with the actual impact of removing nodes on connectivity.

Figures

Figures reproduced from arXiv: 1907.01388 by Simiao Liu, Tao Ren, Tao Zhou, Yanjie Xu, Yi Qi, Yixin Zhang, Zhe Li.

**Figure 1.** Figure 1: The process of RG in a network with six nodes. Here we use degree as the feature f and for convenience in the later description we set ε = 0.01 (note that, we only require that ε is positive yet small enough). Initially, the network is empty. At the first time step, G max 1 (v1) = G max 1 (v2) = G max 1 (v3) = G max 1 (v4) = G max 1 (v5) = G max 1 (v6) = 1, and cost(v1,1) = 1.04, cost(v2,1) = 1.03, cost(v3… view at source ↗

**Figure 2.** Figure 2: Comparing the performance of the background benchmark (random removal, denoted by blue circles), RG (red stars) and the other seven benchmark algorithms (black symbols). The X-axis is the fraction of nodes being removed (i.e., Q/N), and the Y-axis denotes the number of nodes in the largest component divided by N (i.e., S(Q)). The four selected networks are (a) Email, (b) Sex, (c) Facebook, and (d) HepPh, r… view at source ↗

read the original abstract

The identification of vital nodes that maintain the network connectivity is a long-standing challenge in network science. In this paper, we propose a so-called reverse greedy method where the least important nodes are preferentially chosen to make the size of the largest component in the corresponding induced subgraph as small as possible. Accordingly, the nodes being chosen later are more important in maintaining the connectivity. Empirical analyses on ten real networks show that the reverse greedy method performs remarkably better than well-known state-of-the-art methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The reverse greedy ranking is a straightforward new procedure that the authors show outperforming several baselines on ten real networks, with no obvious internal contradictions in the setup.

read the letter

The main takeaway is that this paper defines a reverse greedy procedure for vital node ranking: nodes are added in an order that keeps the largest connected component as small as possible at each step, so the nodes selected later rank higher for maintaining connectivity. That reversal is the concrete novelty relative to standard greedy approaches in the literature. The work is grounded in an observable quantity (LCC size of induced subgraphs) and reports direct comparisons on ten real networks, which is the right kind of evidence for an algorithmic claim like this. The method itself is simple to implement and does not rely on fitted parameters or circular definitions. On the positive side, the empirical framing lets readers check the claim against the same networks and metric. The soft spots are modest but real. The abstract's phrasing of “remarkably better” needs the actual tables or figures to judge effect sizes, which baselines were used, and whether differences survive multiple-testing corrections or hold across different network types. It is also worth checking whether the closest prior greedy variants are cited and compared head-to-head rather than only against a few well-known methods. Those are standard referee points rather than fatal gaps. This paper is aimed at network scientists who work on node importance and robustness measures. Readers who already follow the vital-node literature will get the most out of the direct comparisons; others may find it incremental. The idea is clear enough and the evaluation concrete enough that it deserves a serious referee rather than a desk reject.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a reverse greedy algorithm for ranking nodes by their importance to network connectivity: nodes are added in an order that keeps the largest connected component (LCC) of the induced subgraph as small as possible at each step, so that nodes selected later are deemed more vital. The central empirical claim is that this ordering outperforms well-known state-of-the-art vital-node methods on ten real networks.

Significance. If the reported performance advantage is reproducible and statistically supported, the work would contribute a parameter-free, connectivity-focused ranking procedure to the vital-node literature. The method directly optimizes an observable (LCC size) rather than relying on heuristic centrality scores, which is a conceptual strength. No machine-checked proofs or open code are mentioned, but the explicit comparison on multiple real networks is a positive feature.

major comments (2)

[Results] Results section (presumably containing the ten-network comparisons): the abstract states that the method 'performs remarkably better' but supplies no quantitative metrics (e.g., Kendall-τ, AUC on robustness curves, or statistical significance tests), no list of the ten networks, and no description of the exact baseline implementations. Without these details the central empirical claim cannot be evaluated and is therefore load-bearing for acceptance.
[Method] Method definition (likely §2 or §3): the reverse-greedy procedure is described only at a high level; it is unclear how ties in LCC size are broken, whether the algorithm is deterministic, and what the precise time complexity is. These ambiguities affect reproducibility of the reported rankings and must be resolved before the performance comparison can be trusted.

minor comments (2)

[Abstract] The abstract should be expanded to name the performance metric, the ten networks, and the baselines so that readers can immediately assess the scope of the claim.
[Method] Notation for the induced subgraph and LCC size should be introduced once and used consistently; currently the description mixes verbal and symbolic language.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and will revise the manuscript to improve clarity, reproducibility, and the self-contained nature of the central claims.

read point-by-point responses

Referee: [Results] Results section (presumably containing the ten-network comparisons): the abstract states that the method 'performs remarkably better' but supplies no quantitative metrics (e.g., Kendall-τ, AUC on robustness curves, or statistical significance tests), no list of the ten networks, and no description of the exact baseline implementations. Without these details the central empirical claim cannot be evaluated and is therefore load-bearing for acceptance.

Authors: We agree that the abstract does not include quantitative metrics or network details, which limits immediate evaluation of the performance claim. The results section of the manuscript does list the ten networks (with basic statistics) and reports comparisons against standard baselines using Kendall-τ and robustness-curve AUC, but these are not summarized in the abstract. In the revision we will expand the abstract to report the specific performance gains (e.g., average Kendall-τ improvement) and briefly note the baseline implementations and the use of statistical tests. This change will make the empirical claim self-contained without altering the underlying experiments. revision: yes
Referee: [Method] Method definition (likely §2 or §3): the reverse-greedy procedure is described only at a high level; it is unclear how ties in LCC size are broken, whether the algorithm is deterministic, and what the precise time complexity is. These ambiguities affect reproducibility of the reported rankings and must be resolved before the performance comparison can be trusted.

Authors: We acknowledge that the current method description is high-level and omits tie-breaking, determinism, and complexity. In the revised manuscript we will add an explicit statement that ties are broken by lowest node index (ensuring the procedure is fully deterministic) and will include a complexity analysis of the naïve implementation together with a note on possible optimizations. These additions will be placed in the method section and will not change the algorithm itself. revision: yes

Circularity Check

0 steps flagged

No significant circularity: independent algorithmic definition evaluated on external networks

full rationale

The reverse greedy method is defined directly as an algorithmic procedure: nodes are selected in order of preference to minimize the LCC size of the induced subgraph, with later-selected nodes ranked as more important for connectivity. This definition stands alone without reference to the target ranking or performance metric. The central claim is an empirical comparison of this ranking against other methods on ten real networks, which is externally falsifiable and does not reduce to any fitted parameter, self-definition, or self-citation chain. No load-bearing steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

No free parameters or invented entities are introduced. The approach rests on one domain assumption about using largest-component size as a proxy for connectivity importance.

axioms (1)

domain assumption The size of the largest connected component in the induced subgraph serves as a valid proxy for assessing a node's contribution to network connectivity.
This criterion directly drives the reverse selection order described in the abstract.

pith-pipeline@v0.9.0 · 5614 in / 1085 out tokens · 43788 ms · 2026-05-25T10:29:33.317386+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

[1]

Newman, M. E. J. Networks (Oxford University Press, Oxford, 2018)

work page 2018
[2]

Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, Oxford, 2007)

Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, Oxford, 2007)

work page 2007
[3]

& Havlin, S

Cohen, R., Erez, K., Ben-Avraham, D. & Havlin, S. Breakdown of the internet under intentional attack. Phys. Rev. Lett. 86, 3682-3685 (2001)

work page 2001
[4]

Motter, A. E. & Lai, Y . C. Cascade-based attacks on complex networks.Phys. Rev. E 66, 065102 (2002)

work page 2002
[5]

Motter, A. E. Cascade control and defense in complex networks. Phys. Rev. Lett. 93, 098701 (2004)

work page 2004
[6]

& Nakarado, G

Albert, R., Albert, I. & Nakarado, G. L. Structural vulnerability of the North American power grid.Phys. Rev. E 69, 025103 (2004)

work page 2004
[7]

& Barabási, A

Albert, R., Jeong, H. & Barabási, A. L. Error and attack tolerance of complex networks. Nature 406, 378–382 (2000)

work page 2000
[8]

E.,& Havlind, S

Li, D., Fu, B., Wang, Y ., Lu, G., Berezin, Y ., Stanley, H. E.,& Havlind, S. Percolation transition in dynamical trafﬁc network with evolving critical bottlenecks. Proc. Natl. Acad. Sci. USA 112, 669-672 (2015)

work page 2015
[9]

G., & May, R

Haldane, A. G., & May, R. M. Systemic risk in banking ecosystems. Nature 469, 351-355 (2011)

work page 2011
[10]

L., Zhang, Q

Lü, L., Chen, D., Ren, X. L., Zhang, Q. M., Zhang, Y . C.& Zhou, T. Vital nodes identiﬁcation in complex networks.Phys. Rep. 650, 1-63 (2016)

work page 2016
[11]

Factoring and weighting approaches to status scores and clique identiﬁcation.Math

Bonacich, P. Factoring and weighting approaches to status scores and clique identiﬁcation.Math. Sociol. 2, 113-120 (1972)

work page 1972
[12]

Lü, L., Zhou, T., Zhang, Q. M. & Stanley, H. E. The H-index of a network node and its relation to degree and coreness. Nat. Commun. 7, 10168 (2016)

work page 2016
[13]

K., Havlin, S., Liljeros, F., Muchnik, L.& Stanley, H

Kitsak, M., Gallos, L. K., Havlin, S., Liljeros, F., Muchnik, L.& Stanley, H. E. Identiﬁcation of inﬂuential spreaders in complex networks. Nat. Phys. 6, 888-893 (2010)

work page 2010
[14]

& Page, L

Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998)

work page 1998
[15]

Leaders in social networks, the delicious case

Lü, L., Zhang, Y .-C., Yeung, C.-H.& Zhou, T. Leaders in social networks, the delicious case. PLoS ONE 6, e21202 (2011)

work page 2011
[16]

Freeman, L. C. Centrality in social networks conceptual clariﬁcation.Soc. Networks 1, 215-239 (1979)

work page 1979
[17]

Freeman, L. C. A set of measures of centrality based on betweenness. Sociometry 40, 35-41 (1977)

work page 1977
[18]

& Makse, H

Morone, F. & Makse, H. A. Inﬂuence maximization in complex networks through optimal percolation. Nature 524, 65-68 (2015). 5/6

work page 2015
[19]

& Makse, H

Morone, F., Min, B., Bo, L., Mari, R. & Makse, H. A. Collective inﬂuence algorithm to ﬁnd inﬂuencers via optimal percolation in massively large social media. Sci. Rep. 6, 30062 (2016)

work page 2016
[20]

& Danon, L

Gleiser, P. & Danon, L. Community structure in Jazz.Adv. Complex Syst. 6, 565 (2003)

work page 2003
[21]

Newman, M. E. J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006)

work page 2006
[22]

& Arenas, A

Guimer˙a, R., Danon, L., Díaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003)

work page 2003
[23]

Adamic, L. A. & Glance, N. The political blogosphere and the 2004 U.S. election: divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery (ACM Press, 2005, pp. 36-43)

work page 2004
[24]

E., Liljeros, F

Rocha, L. E., Liljeros, F. & Holme, P. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Comput. Biol. 7, e1001109 (2011)

work page 2011
[25]

& Gummadi, K

Viswanath, B., Mislove, A., Cha, M. & Gummadi, K. P. On the Evolution of User Interaction in Facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks (ACM Press, 2009, pp. 37–42)

work page 2009
[26]

& Mrvar, A

Batageli, V . & Mrvar, A. Pajek Datasets. Available at http://vlado.fmf.uni-lj.si/pub/networks/data/ (2007)

work page 2007
[27]

Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature 393, 440-442 (1998)

work page 1998
[28]

& Anderson, T

Spring, N., Mahajan, R., Wetherall, D. & Anderson, T. Measuring ISP topologies with rocketfuel. IEEE/ACM Trans. Networking 12, 2-16 (2004)

work page 2004
[29]

& Kleinberg, J

Gehrke, J., Ginsparg, P. & Kleinberg, J. Overview of the 2003 KDD Cup.SIGKDD Explorations 5, 149-151 (2003)

work page 2003
[30]

Newman, M. E. J. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002)

work page 2002
[31]

Hu, H. B. & Wang, X. F. Uniﬁed index to quantifying heterogeneity of complex networks. Physica A 387, 3769-3780 (2008)

work page 2008
[32]

M., Moreira, A

Schneider, C. M., Moreira, A. A., Andrade, J. S., Havlin, S.& Herrmann, H. J. Mitigation of malicious attacks on networks. Proc. Natl. Acad. Sci. U.S.A. 108, 3838-3841 (2011)

work page 2011
[33]

& Hansen, E

Zhou, R. & Hansen, E. A. Beam-stack search: Integrating backtracking with beam search. In Proceedings of the 15th International Conference on Automated Planning and Scheduling (AAAI Press, 2005, pp. 90–98)

work page 2005
[34]

Hirsch, J. E. An index to quantify an individual’s scientiﬁc research output.Proc. Natl. Acad. Sci. U.S.A. 102, 16569–16572 (2005). Acknowledgements The authors acknowledge DataCastle to hold the related world-wide competition and to share the data. This work is partially supported by National Natural Science Foundation of China (61473073, 61104074, 61433...

work page 2005

[1] [1]

Newman, M. E. J. Networks (Oxford University Press, Oxford, 2018)

work page 2018

[2] [2]

Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, Oxford, 2007)

Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, Oxford, 2007)

work page 2007

[3] [3]

& Havlin, S

Cohen, R., Erez, K., Ben-Avraham, D. & Havlin, S. Breakdown of the internet under intentional attack. Phys. Rev. Lett. 86, 3682-3685 (2001)

work page 2001

[4] [4]

Motter, A. E. & Lai, Y . C. Cascade-based attacks on complex networks.Phys. Rev. E 66, 065102 (2002)

work page 2002

[5] [5]

Motter, A. E. Cascade control and defense in complex networks. Phys. Rev. Lett. 93, 098701 (2004)

work page 2004

[6] [6]

& Nakarado, G

Albert, R., Albert, I. & Nakarado, G. L. Structural vulnerability of the North American power grid.Phys. Rev. E 69, 025103 (2004)

work page 2004

[7] [7]

& Barabási, A

Albert, R., Jeong, H. & Barabási, A. L. Error and attack tolerance of complex networks. Nature 406, 378–382 (2000)

work page 2000

[8] [8]

E.,& Havlind, S

Li, D., Fu, B., Wang, Y ., Lu, G., Berezin, Y ., Stanley, H. E.,& Havlind, S. Percolation transition in dynamical trafﬁc network with evolving critical bottlenecks. Proc. Natl. Acad. Sci. USA 112, 669-672 (2015)

work page 2015

[9] [9]

G., & May, R

Haldane, A. G., & May, R. M. Systemic risk in banking ecosystems. Nature 469, 351-355 (2011)

work page 2011

[10] [10]

L., Zhang, Q

Lü, L., Chen, D., Ren, X. L., Zhang, Q. M., Zhang, Y . C.& Zhou, T. Vital nodes identiﬁcation in complex networks.Phys. Rep. 650, 1-63 (2016)

work page 2016

[11] [11]

Factoring and weighting approaches to status scores and clique identiﬁcation.Math

Bonacich, P. Factoring and weighting approaches to status scores and clique identiﬁcation.Math. Sociol. 2, 113-120 (1972)

work page 1972

[12] [12]

Lü, L., Zhou, T., Zhang, Q. M. & Stanley, H. E. The H-index of a network node and its relation to degree and coreness. Nat. Commun. 7, 10168 (2016)

work page 2016

[13] [13]

K., Havlin, S., Liljeros, F., Muchnik, L.& Stanley, H

Kitsak, M., Gallos, L. K., Havlin, S., Liljeros, F., Muchnik, L.& Stanley, H. E. Identiﬁcation of inﬂuential spreaders in complex networks. Nat. Phys. 6, 888-893 (2010)

work page 2010

[14] [14]

& Page, L

Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30, 107–117 (1998)

work page 1998

[15] [15]

Leaders in social networks, the delicious case

Lü, L., Zhang, Y .-C., Yeung, C.-H.& Zhou, T. Leaders in social networks, the delicious case. PLoS ONE 6, e21202 (2011)

work page 2011

[16] [16]

Freeman, L. C. Centrality in social networks conceptual clariﬁcation.Soc. Networks 1, 215-239 (1979)

work page 1979

[17] [17]

Freeman, L. C. A set of measures of centrality based on betweenness. Sociometry 40, 35-41 (1977)

work page 1977

[18] [18]

& Makse, H

Morone, F. & Makse, H. A. Inﬂuence maximization in complex networks through optimal percolation. Nature 524, 65-68 (2015). 5/6

work page 2015

[19] [19]

& Makse, H

Morone, F., Min, B., Bo, L., Mari, R. & Makse, H. A. Collective inﬂuence algorithm to ﬁnd inﬂuencers via optimal percolation in massively large social media. Sci. Rep. 6, 30062 (2016)

work page 2016

[20] [20]

& Danon, L

Gleiser, P. & Danon, L. Community structure in Jazz.Adv. Complex Syst. 6, 565 (2003)

work page 2003

[21] [21]

Newman, M. E. J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006)

work page 2006

[22] [22]

& Arenas, A

Guimer˙a, R., Danon, L., Díaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003)

work page 2003

[23] [23]

Adamic, L. A. & Glance, N. The political blogosphere and the 2004 U.S. election: divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery (ACM Press, 2005, pp. 36-43)

work page 2004

[24] [24]

E., Liljeros, F

Rocha, L. E., Liljeros, F. & Holme, P. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Comput. Biol. 7, e1001109 (2011)

work page 2011

[25] [25]

& Gummadi, K

Viswanath, B., Mislove, A., Cha, M. & Gummadi, K. P. On the Evolution of User Interaction in Facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks (ACM Press, 2009, pp. 37–42)

work page 2009

[26] [26]

& Mrvar, A

Batageli, V . & Mrvar, A. Pajek Datasets. Available at http://vlado.fmf.uni-lj.si/pub/networks/data/ (2007)

work page 2007

[27] [27]

Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature 393, 440-442 (1998)

work page 1998

[28] [28]

& Anderson, T

Spring, N., Mahajan, R., Wetherall, D. & Anderson, T. Measuring ISP topologies with rocketfuel. IEEE/ACM Trans. Networking 12, 2-16 (2004)

work page 2004

[29] [29]

& Kleinberg, J

Gehrke, J., Ginsparg, P. & Kleinberg, J. Overview of the 2003 KDD Cup.SIGKDD Explorations 5, 149-151 (2003)

work page 2003

[30] [30]

Newman, M. E. J. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002)

work page 2002

[31] [31]

Hu, H. B. & Wang, X. F. Uniﬁed index to quantifying heterogeneity of complex networks. Physica A 387, 3769-3780 (2008)

work page 2008

[32] [32]

M., Moreira, A

Schneider, C. M., Moreira, A. A., Andrade, J. S., Havlin, S.& Herrmann, H. J. Mitigation of malicious attacks on networks. Proc. Natl. Acad. Sci. U.S.A. 108, 3838-3841 (2011)

work page 2011

[33] [33]

& Hansen, E

Zhou, R. & Hansen, E. A. Beam-stack search: Integrating backtracking with beam search. In Proceedings of the 15th International Conference on Automated Planning and Scheduling (AAAI Press, 2005, pp. 90–98)

work page 2005

[34] [34]

Hirsch, J. E. An index to quantify an individual’s scientiﬁc research output.Proc. Natl. Acad. Sci. U.S.A. 102, 16569–16572 (2005). Acknowledgements The authors acknowledge DataCastle to hold the related world-wide competition and to share the data. This work is partially supported by National Natural Science Foundation of China (61473073, 61104074, 61433...

work page 2005