Graph neural network explanations reveal a topological signature of disease-associated hubs in biological networks

Dennis Veselkov; Ivan Laponogov; Kirill Veselkov; Kyle Higgins

arXiv: 2605.21502 · v1 · pith:NAICUIUGnew · submitted 2026-05-08 · 🧬 q-bio.MN · cs.AI · cs.LG

Pith reviewed 2026-05-22 02:27 UTC · model grok-4.3

classification 🧬 q-bio.MN · cs.AI · cs.LG

keywords graph neural networks · explanation methods · biological networks · cancer hubs · topological signature · protein-protein interaction · TCGA BRCA · integrated gradients

The pith

Graph neural network explanations uncover a topological signature where attribution peaks next to disease hubs and decays with network distance in breast cancer data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests four common post-hoc explanation techniques on graph neural networks that model protein-protein interaction networks using breast cancer RNA-seq data. It demonstrates that integrated gradients and layer-wise relevance propagation recover a repeating pattern in which importance scores are highest in the immediate neighbors of disease hubs and fall off across farther network shells, with this pattern aligning to known cancer genes. Building on that observation, the authors combine a distance-shell hub score with consensus across explainers to produce rankings that better surface canonical cancer genes and coherent signaling pathways while depending less on simple node degree.

Core claim

In TCGA BRCA data projected onto a protein-protein interaction network, explanation attributions from graph neural networks display a consistent topological signature in which scores peak in the immediate one-hop neighborhood of disease-associated hubs and decay across successive network shells; this pattern is most pronounced for integrated gradients and layer-wise relevance propagation and coincides with strong enrichment for known cancer hubs, while a consensus framework that merges shell-based local scores with cross-method agreement improves prioritization of genes such as TP53, BRCA1, ESR1, and MYC and recovers biologically coherent programs including ERBB2, RTK, MAPK, immune, and Cytk

What carries the argument

The shell-based hub score that quantifies how explanation attribution changes across successive network distance shells from a candidate hub node, combined with consensus ranking across multiple explanation methods.
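The idea of a shell-based score can be sketched in a few lines. This is a minimal illustration, not the paper's definition: the paper's exact formula is not given here, so the `shell_hub_score` contrast (1-hop mean minus the mean of the outer shells), the `max_hops` cutoff, and the toy graph and attribution values are all assumptions made for the example.

```python
from collections import deque

def bfs_shells(adj, hub, max_hops=3):
    """Group nodes by shortest-path distance (1..max_hops) from `hub`."""
    dist = {hub: 0}
    queue = deque([hub])
    shells = {k: [] for k in range(1, max_hops + 1)}
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the outermost shell
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                shells[dist[nb]].append(nb)
                queue.append(nb)
    return shells

def shell_profile(adj, attribution, hub, max_hops=3):
    """Mean absolute attribution in each non-empty distance shell around `hub`."""
    profile = {}
    for k, nodes in bfs_shells(adj, hub, max_hops).items():
        if nodes:
            profile[k] = sum(abs(attribution[n]) for n in nodes) / len(nodes)
    return profile

def shell_hub_score(profile):
    """Hypothetical score: 1-hop mean minus the mean over outer shells."""
    outer = [v for k, v in profile.items() if k > 1]
    if 1 not in profile or not outer:
        return 0.0
    return profile[1] - sum(outer) / len(outer)

# Toy graph (made up): hub H, 1-hop {A, B}, 2-hop {C, D}, 3-hop {E}.
adj = {
    "H": ["A", "B"], "A": ["H", "C"], "B": ["H", "D"],
    "C": ["A", "E"], "D": ["B"], "E": ["C"],
}
# Made-up attribution values that peak at 1 hop and decay outward.
attribution = {"H": 0.2, "A": 0.8, "B": 0.6, "C": 0.4, "D": 0.2, "E": 0.1}

profile = shell_profile(adj, attribution, "H")
print(profile)                   # approximately {1: 0.7, 2: 0.3, 3: 0.1}
print(shell_hub_score(profile))  # approximately 0.5
```

A positive score under this contrast indicates the peak-then-decay shape described above; a flat or rising profile gives a score near zero or below.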

If this is right

Integrated gradients and layer-wise relevance propagation preferentially recover distributed pathway-like signals while saliency attribution favors sparse single-node drivers.
Consensus scores that blend local shell information with agreement across explainers improve prioritization of canonical cancer genes and reduce dependence on node degree.
Pathway enrichment of the resulting rankings recovers coherent cancer programs such as ERBB2, RTK, MAPK, immune, and cytokine signaling.
A trade-off exists between local hub enrichment, which favors IG and LRP, and global gene ranking performance, which favors saliency attribution.
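One way to picture a consensus score that blends cross-explainer agreement with a local shell score is rank aggregation. The sketch below is an assumption-laden illustration: the mean-of-normalized-ranks rule, the blending weight `alpha`, and the gene names and numbers are hypothetical and are not taken from the paper.

```python
def rank_normalize(scores):
    """Map scores to (0, 1]; the highest score gets 1.0 (ties follow sort order)."""
    ordered = sorted(scores, key=lambda g: scores[g])
    n = len(ordered)
    return {g: (i + 1) / n for i, g in enumerate(ordered)}

def consensus_scores(explainer_scores, shell_scores, alpha=0.5):
    """Blend mean normalized rank across explainers with the shell-score rank.

    `alpha` is a hypothetical tuning weight: 1.0 uses only cross-method
    agreement, 0.0 uses only the shell-based score.
    """
    per_method = [rank_normalize(s) for s in explainer_scores.values()]
    shell_rank = rank_normalize(shell_scores)
    out = {}
    for g in shell_scores:
        agreement = sum(r[g] for r in per_method) / len(per_method)
        out[g] = alpha * agreement + (1 - alpha) * shell_rank[g]
    return out

# Made-up attribution scores for three placeholder genes.
explainer_scores = {
    "SA":  {"g1": 0.9, "g2": 0.1, "g3": 0.5},
    "IG":  {"g1": 0.7, "g2": 0.2, "g3": 0.8},
    "LRP": {"g1": 0.6, "g2": 0.3, "g3": 0.4},
}
shell_scores = {"g1": 0.5, "g2": 0.0, "g3": 0.2}

scores = consensus_scores(explainer_scores, shell_scores, alpha=0.5)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Working on ranks rather than raw scores sidesteps the fact that different explainers produce attributions on incomparable scales, which is one plausible reason to aggregate this way.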

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same shell-decay signature could be tested as a general marker for disease-relevant modules in other complex networks beyond breast cancer.
Choosing an explanation method according to whether local neighborhood or global ranking is the goal may become a standard step in biological network analysis.
Training graph neural networks with explicit penalties or rewards for producing this topological signature might strengthen recovery of disease mechanisms.

Load-bearing premise

The observed attribution decay pattern and enrichment for known cancer genes reflect biologically meaningful disease mechanisms rather than artifacts of network topology or the chosen graph neural network architecture and training procedure.

What would settle it

Applying the same pipeline to shuffled gene-expression labels or edge-randomized networks and finding neither the decaying attribution signature nor statistically significant enrichment for known cancer genes.
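The two null models named here can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the double-edge-swap routine is a generic degree-preserving randomization, not the authors' code, and the toy edge list and labels are made up. In a real test, the GNN would be retrained and re-explained on each randomized input.

```python
import random

def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Double-edge swap: (a,b),(c,d) -> (a,d),(c,b).

    Swaps that would create a self-loop or a duplicate edge are rejected,
    so every node keeps its original degree.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # shared endpoint would create a self-loop or no-op
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def degrees(edges):
    """Node degree table for an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def shuffle_labels(labels, seed=0):
    """Permute sample labels, breaking any link between expression and label."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled

# Toy network (made up): an 8-node ring with two chords.
edges = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (2, 6)]
rewired = degree_preserving_rewire(edges, n_swaps=5, seed=1)
labels = [0, 0, 1, 1, 1, 0, 1, 0]
null_labels = shuffle_labels(labels, seed=1)
print(degrees(rewired) == degrees(edges))  # degree sequence is unchanged
```

If the shell-decay profile persists on the rewired network or under shuffled labels, it is more likely to reflect degree structure than disease signal.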

Figures

Figures reproduced from arXiv: 2605.21502 by Dennis Veselkov, Ivan Laponogov, Kirill Veselkov, Kyle Higgins.

**Figure 2.** Quantitative comparison of attribution concentration on ground

Original abstract

Graph neural networks (GNNs) are increasingly used to model biological systems, yet the reliability of post-hoc explanation methods for recovering meaningful molecular mechanisms remains unclear. Here, we systematically evaluate four widely used approaches: Saliency Attribution (SA), Integrated Gradients (IG), GNNExplainer, and Layer-wise Relevance Propagation (LRP) for identifying disease-relevant structure in breast cancer RNA-seq data projected onto a protein-protein interaction network. Using synthetic benchmarks with known ground-truth motifs, we show that explanation methods recover distinct signal organizations: SA performs best for sparse single-node drivers, whereas IG and LRP preferentially recover distributed pathway-like and cascade-like signals. In TCGA BRCA data, we identify a consistent topological signature of disease-associated hubs in which attribution peaks in the immediate 1-hop neighborhood and decays across successive network shells, a pattern most pronounced for IG and LRP and associated with strong enrichment of known cancer hubs. We further observe a trade-off between local hub enrichment and global gene ranking performance, with IG optimizing local enrichment and SA achieving superior global discrimination. Motivated by these complementary behaviors, we introduce a framework combining a shell-based hub score with consensus ranking across explainers. Consensus scores improve prioritization of canonical cancer genes (TP53, BRCA1, ESR1, MYC), reduce dependence on node degree, and, especially when tuned, outperform individual methods. Pathway enrichment further reveals improved recovery of biologically coherent cancer programs, including ERBB2, RTK, MAPK, immune, and cytokine signaling. Together, these results demonstrate that topology-aware integration of graph explanations can improve biological interpretability and biologically relevant molecular recovery.
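For readers unfamiliar with Integrated Gradients, the method attributes a prediction by integrating the model's gradient along a straight path from a baseline input to the actual input. The sketch below is a toy illustration only: `toy_model` is a hypothetical scalar function standing in for a trained GNN, the all-zeros baseline is an assumption, and gradients are taken by finite differences rather than automatic differentiation.

```python
def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of scalar function f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += eps
        dn[i] -= eps
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=200):
    """Midpoint-rule approximation of IG along the straight baseline-to-x path."""
    total = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = numeric_grad(f, point)
        total = [t + gi for t, gi in zip(total, g)]
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

def toy_model(x):
    """Hypothetical stand-in for a model output: one interaction, one quadratic term."""
    return x[0] * x[1] + x[2] ** 2

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attr = integrated_gradients(toy_model, x, baseline)
print(attr)       # approximately [1.0, 1.0, 9.0]
print(sum(attr))  # approximately toy_model(x) - toy_model(baseline) = 11.0
```

The final check reflects the completeness property of IG: attributions sum to the difference between the output at the input and at the baseline, which is what makes per-node scores comparable across shells.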

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper finds a 1-hop attribution peak and outward decay around disease hubs on PPI networks with IG and LRP, plus a consensus explainer that improves cancer gene recovery, but the pattern could be a topology artifact.

The main point is that GNN explanations on a PPI network for TCGA breast cancer data show attributions peaking in the 1-hop neighborhood of hubs and decaying in outer shells, especially for IG and LRP, and that a consensus method across explainers plus a shell-based score gives better prioritization of known cancer genes.

What stands out is the systematic test on synthetic motifs. It separates the explainers by the kind of signal they recover: SA for single-node, IG and LRP for distributed cascades. That carries over to the real data where they see the decay pattern and link it to enrichment of cancer hubs. The consensus ranking they propose reduces reliance on degree and improves recovery of genes like TP53, BRCA1, ESR1, MYC along with coherent pathways such as ERBB2 and MAPK. Those are practical gains.

The soft spot is the lack of controls for network topology effects. PPI networks are degree-heterogeneous, and GNN message passing with gradient explainers tends to concentrate scores near high-degree nodes anyway. The synthetic tests do not fully replicate that structure or the real label setup, so the decay could appear under null conditions. They would need rewired networks or shuffled labels to show the pattern is disease-driven rather than structural. Without that, the biological interpretation stays tentative even though the enrichment numbers look good.

This is useful for people in network biology who use GNNs for gene ranking and want to make the outputs more interpretable. A reader interested in combining explanation methods will pick up a workable approach. The paper has enough empirical substance and a clear method to warrant sending it to referees, though they should ask for the topology controls.

Referee Report

2 major / 2 minor

Summary. The paper evaluates four post-hoc GNN explanation methods (SA, IG, GNNExplainer, LRP) on models trained to predict disease status from RNA-seq projected onto a PPI network. Synthetic motif benchmarks show method-specific recovery of sparse vs. distributed signals. In TCGA BRCA data the authors report a topological signature in which IG and LRP attributions peak in the 1-hop neighborhood and decay across successive shells, with strong enrichment for known cancer hubs; they introduce a consensus framework that combines shell-based hub scoring with cross-explainer ranking and claim improved prioritization of canonical cancer genes and coherent pathways.

Significance. If the central observations survive appropriate controls, the work supplies a concrete, topology-aware recipe for extracting biologically interpretable signals from GNN explanations in molecular networks. The reported trade-off between local hub enrichment and global ranking, together with the consensus improvement on TP53/BRCA1/ESR1/MYC and ERBB2/RTK/MAPK programs, would be a useful practical contribution to the interpretability literature in systems biology.

major comments (2)

[TCGA BRCA results] TCGA BRCA results section: the claim that the 1-hop peak and shell-wise decay constitute a 'disease-associated' topological signature is load-bearing. PPI networks are strongly degree-heterogeneous and GNN message passing plus gradient-based explainers (IG, LRP) naturally concentrate relevance on high-degree nodes and their immediate neighborhoods. Without explicit null controls (randomized labels, degree-preserving rewired edges, or expression-shuffled data) the observed pattern remains compatible with an architectural artifact rather than a biological signal.
[Synthetic benchmarks] Synthetic-to-real translation paragraph: the motif benchmarks do not reproduce the heavy-tailed degree distribution or the continuous expression-label structure of real TCGA data. Consequently the differential performance of IG/LRP versus SA on synthetic motifs does not license the inference that the same methods are recovering disease-specific structure on the real network.

minor comments (2)

[Results] Ensure all quantitative claims (enrichment p-values, ranking improvements, consensus scores) are accompanied by error bars or confidence intervals and by the exact statistical tests used.
[Methods] Define the shell-based hub score and the precise consensus aggregation rule (including any tuning parameters) in a dedicated methods subsection so that the framework can be reproduced.
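As an example of what naming "the exact statistical test" could look like for the enrichment claims, a one-sided hypergeometric test is a common choice for asking whether a top-ranked gene set overlaps a curated list more than chance would predict. Whether the authors used this test is not stated here; the function name and the toy counts below are assumptions for illustration.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_known, n_top, n_overlap):
    """One-sided p-value P(X >= n_overlap) for a hypergeometric draw.

    n_universe: genes in the network
    n_known:    genes on the curated list
    n_top:      size of the top-ranked set being tested
    n_overlap:  curated genes found in the top-ranked set
    """
    upper = min(n_known, n_top)
    total = comb(n_universe, n_top)
    tail = sum(
        comb(n_known, k) * comb(n_universe - n_known, n_top - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy counts (made up): 20 genes, 5 curated, top 5 ranked, 3 overlap.
p = hypergeom_enrichment_p(n_universe=20, n_known=5, n_top=5, n_overlap=3)
print(round(p, 4))  # 0.0726
```

Reporting the universe size alongside the p-value matters, since the same overlap can look significant or unremarkable depending on how many genes were eligible for ranking.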

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate additional controls and clarifications where needed, while preserving the core contributions.

Referee: [TCGA BRCA results] TCGA BRCA results section: the claim that the 1-hop peak and shell-wise decay constitute a 'disease-associated' topological signature is load-bearing. PPI networks are strongly degree-heterogeneous and GNN message passing plus gradient-based explainers (IG, LRP) naturally concentrate relevance on high-degree nodes and their immediate neighborhoods. Without explicit null controls (randomized labels, degree-preserving rewired edges, or expression-shuffled data) the observed pattern remains compatible with an architectural artifact rather than a biological signal.

Authors: We agree that degree heterogeneity in PPI networks can bias gradient-based attributions toward high-degree nodes and their neighborhoods, and that explicit null controls are required to establish the pattern as disease-associated rather than architectural. In the revised manuscript we will add three null-model experiments: (i) degree-preserving edge rewiring while keeping the original node degrees and expression values, (ii) randomization of disease labels, and (iii) shuffling of expression values across samples. These controls will be reported in a new supplementary figure and table showing that the 1-hop peak and shell-wise decay are substantially attenuated under the null conditions, while remaining statistically significant in the original data. We will also quantify the enrichment of known cancer hubs under each null model to demonstrate specificity.

revision: yes

Referee: [Synthetic benchmarks] Synthetic-to-real translation paragraph: the motif benchmarks do not reproduce the heavy-tailed degree distribution or the continuous expression-label structure of real TCGA data. Consequently the differential performance of IG/LRP versus SA on synthetic motifs does not license the inference that the same methods are recovering disease-specific structure on the real network.

Authors: The referee is correct that the synthetic motif benchmarks employ simplified topologies and binary labels that do not replicate the heavy-tailed degree distribution or continuous expression values of TCGA data. These benchmarks were designed only to isolate the explainers' relative sensitivity to sparse single-node versus distributed pathway-like signals. In the revision we will explicitly state this scope limitation in the synthetic-to-real paragraph and clarify that claims about disease-specific structure in the TCGA results rest on (a) the observed enrichment of high-attribution nodes for canonical cancer genes and (b) the improved prioritization achieved by the consensus framework, rather than on direct extrapolation from the synthetic results. We will also add a short limitations paragraph acknowledging the gap between synthetic and real-data regimes.

revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation relies on independent benchmarks and external gene sets

The paper first validates four explanation methods on synthetic benchmarks containing explicit ground-truth motifs, then applies the same methods to TCGA BRCA expression data projected on a PPI network, and finally constructs a consensus framework motivated by the observed complementary behaviors. The topological signature (1-hop peak and shell decay) is reported as an empirical pattern in real data, with enrichment evaluated against independently curated cancer gene lists and pathways. No equation or claim reduces a reported prediction to a fitted parameter by construction, no uniqueness theorem is imported from prior self-work, and the combined score is presented as a post-hoc integration rather than a self-defining tautology. The central observations therefore remain externally grounded rather than internally forced.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on standard assumptions from graph theory and prior GNN explanation literature; no new invented entities are introduced. The consensus framework likely involves tunable parameters for combining scores, but these are not detailed in the abstract.

axioms (1)

domain assumption: Protein-protein interaction networks accurately capture relevant biological relationships for disease signal propagation.
Implicit in projecting RNA-seq data onto the PPI network and interpreting attribution patterns as disease-relevant.

pith-pipeline@v0.9.0 · 5843 in / 1413 out tokens · 36383 ms · 2026-05-22T02:27:06.455004+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_strictMono_of_one_lt · unclear
Paper passage: "attribution peaks in the immediate 1-hop neighborhood and decays across successive network shells... shell-based hub score"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: "consistent topological signature of disease-associated hubs... enrichment of known cancer hubs"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1] IG showed moderate degradation after residualization for some genes, for example BRCA1 falling from rank 900 to 2037, whereas GNNExplainer’s residualized ranks improved markedly but remained less biologically selective overall, consistent with a noisier and less targeted signal. Together, these results indicate that while some raw explainer scores are par...

[2] eye-of-the-storm. SA highlighted neuronal and synaptic pathways, most prominently GABA receptor activation, neurotransmitter receptor signaling, and transmission across chemical synapses, though it was able to recover potentially relevant oncogenic signaling such as ERBB4. LRP produced a more fragmented enrichment profile centered on xenobiotic metabolism, cytochrome P450 ac... doi:10.629219/2049, 2022.
