arxiv: 2604.18868 · v1 · submitted 2026-04-20 · 💻 cs.LG

Recognition: unknown

Subgraph Concept Networks: Concept Levels in Graph Classification

Lucie Charlotte Magister , Alexander Norcliffe , Iulia Duta , Pietro Lio

Authors on Pith no claims yet

Pith reviewed 2026-05-10 04:39 UTC · model grok-4.3

classification 💻 cs.LG

keywords graph neural networksconcept-based explanationssubgraph conceptsgraph classificationmodel interpretabilitysoft clusteringexplainable AI

0 comments

The pith

The Subgraph Concept Network is the first GNN architecture to distill subgraph and graph-level concepts from node embeddings via soft clustering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Graph neural networks for classification often remain opaque because pooling operations obscure how individual node features contribute to the final decision. The paper introduces the Subgraph Concept Network to address this by deriving higher-level concepts through soft clustering applied to node concept embeddings. This produces explanations organized at subgraph and entire-graph scales in addition to nodes. Readers would care because the approach promises to increase trust in predictions on structured data without requiring a large sacrifice in accuracy.

Core claim

The Subgraph Concept Network is the first graph neural network architecture that distils subgraph and graph-level concepts. It achieves this by performing soft clustering on node concept embeddings to derive subgraph and graph-level concepts. Our results show that the Subgraph Concept Network allows to obtain competitive model accuracy, while discovering meaningful concepts at different levels of the network.

What carries the argument

Soft clustering performed on node concept embeddings to derive subgraph and graph-level concepts.

If this is right

The model supplies explanations that incorporate the effects of pooling by surfacing subgraph and graph concepts.
Standard graph classification accuracy remains competitive with existing GNNs.
Concepts are discovered automatically at node, subgraph, and graph scales within a single network.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The multi-level clustering approach could be tested on molecular graphs to check whether discovered subgraphs align with known functional groups.
If the concepts prove stable across similar graphs, they might support transfer of explanations between related classification tasks.
Combining this architecture with attention layers could produce even more fine-grained hierarchical explanations.

Load-bearing premise

Soft clustering on node concept embeddings automatically yields clusters that represent genuine, human-interpretable concepts explaining the model's classification decisions.

What would settle it

On a synthetic graph dataset containing known planted substructures, the extracted higher-level concepts show no correspondence to those substructures and provide no measurable gain in explanation quality over node-only baselines.

Figures

Figures reproduced from arXiv: 2604.18868 by Alexander Norcliffe, Iulia Duta, Lucie Charlotte Magister, Pietro Lio.

**Figure 2.** Figure 2: Subgraph concept discovered for the Grid dataset, showing that the grid structure is identified [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: Instance-level explanation for the Grid-House dataset. We can see the grid and house structure [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

**Figure 4.** Figure 4: Subgraph concept discovered for the STARS dataset, showing that the model can separate the [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗

**Figure 5.** Figure 5: Instance-level explanation for the House-Colour dataset. We can see the green house structure [PITH_FULL_IMAGE:figures/full_fig_p014_5.png] view at source ↗

**Figure 6.** Figure 6: Instance-level explanation for the Reddit-Binary dataset, selecting 4 out of 10 subgraphs [PITH_FULL_IMAGE:figures/full_fig_p015_6.png] view at source ↗

**Figure 7.** Figure 7: Node concepts discovered at the subgraph level for the Mutagenicity dataset, showing that the [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗

**Figure 8.** Figure 8: Node concepts discovered at the subgraph level for the Reddit-Binary dataset, showing that the [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗

**Figure 9.** Figure 9: Prototypes discovered with the GIP method [ [PITH_FULL_IMAGE:figures/full_fig_p018_9.png] view at source ↗

**Figure 10.** Figure 10: Prototypes discovered with the GIP [13] method for the House-Colour dataset, with four prototypes per class. We can identify prototypes with the house structure across both classes. Note, that the node colouring is an addition we add since the feature labels are important for this task. 18 [PITH_FULL_IMAGE:figures/full_fig_p018_10.png] view at source ↗

**Figure 11.** Figure 11: Prototypes discovered with the GIP [13] method for the Reddit-binary dataset, with ten prototypes per class. Overall, the prototypes for each class look similar, which may indicate that we set the number of prototypes too high. However, we can identify densely connected structures, which is indicative for the classes of the dataset. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_11.png] view at source ↗

read the original abstract

The reasoning process of Graph Neural Networks is complex and considered opaque, limiting trust in their predictions. To alleviate this issue, prior work has proposed concept-based explanations, extracted from clusters in the model's node embeddings. However, a limitation of concept-based explanations is that they only explain the node embedding space and are obscured by pooling in graph classification. To mitigate this issue and provide a deeper level of understanding, we propose the Subgraph Concept Network. The Subgraph Concept Network is the first graph neural network architecture that distils subgraph and graph-level concepts. It achieves this by performing soft clustering on node concept embeddings to derive subgraph and graph-level concepts. Our results show that the Subgraph Concept Network allows to obtain competitive model accuracy, while discovering meaningful concepts at different levels of the network.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces the Subgraph Concept Network (SCN), a new GNN architecture for graph classification that extends concept-based explanations beyond node embeddings. It performs soft clustering on node concept embeddings to derive subgraph-level and graph-level concepts, claiming this yields competitive classification accuracy while discovering meaningful multi-level concepts that address the opacity of standard GNNs and the pooling-induced limitations of prior node-only concept methods.

Significance. If the empirical claims hold with proper validation, the work could meaningfully advance interpretability research in graph ML by providing a principled way to extract higher-level concepts. However, the significance is limited by the absence of any reported quantitative results, baselines, ablation studies, or explicit metrics for concept meaningfulness, leaving open whether the clustering step produces decision-relevant or human-interpretable structures rather than arbitrary groupings.

major comments (2)

[Abstract] Abstract: The central claim that soft clustering on node concept embeddings automatically produces subgraph and graph-level concepts that are both meaningful and explanatory is load-bearing but unsupported. Clustering operates on vector similarity alone and incorporates neither the input graph's edge structure nor the influence of those clusters on the final pooled prediction, so it is unclear why the resulting groups correspond to coherent subgraphs or alter model decisions as expected.
[Abstract] Abstract: The assertions of 'competitive model accuracy' and 'discovering meaningful concepts' lack any supporting quantitative evidence, baselines, ablation studies, or evaluation protocol for interpretability. Without these, the empirical contribution cannot be assessed and the architecture's practical value over existing concept-based GNN explainers remains unverified.

minor comments (1)

[Abstract] The abstract would be strengthened by briefly indicating the datasets, tasks, and specific metrics used to quantify both accuracy and concept meaningfulness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment below, providing clarifications on the methodology and committing to revisions that strengthen the empirical support and explanations.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim that soft clustering on node concept embeddings automatically produces subgraph and graph-level concepts that are both meaningful and explanatory is load-bearing but unsupported. Clustering operates on vector similarity alone and incorporates neither the input graph's edge structure nor the influence of those clusters on the final pooled prediction, so it is unclear why the resulting groups correspond to coherent subgraphs or alter model decisions as expected.

Authors: The node concept embeddings are outputs of a GNN whose layers perform message passing over the input graph edges, so the embeddings already encode structural neighborhood information. Soft clustering groups nodes whose embeddings are similar, which—because of the preceding GNN—corresponds to nodes that play analogous structural roles. Subgraph-level concepts are then obtained by associating each cluster with the subgraphs induced by its member nodes. We agree that the manuscript would benefit from an explicit statement of this dependence and from additional analysis showing how cluster assignments affect the final pooled prediction; we will add both in the revised version. revision: partial
Referee: [Abstract] Abstract: The assertions of 'competitive model accuracy' and 'discovering meaningful concepts' lack any supporting quantitative evidence, baselines, ablation studies, or evaluation protocol for interpretability. Without these, the empirical contribution cannot be assessed and the architecture's practical value over existing concept-based GNN explainers remains unverified.

Authors: The full manuscript reports accuracy results on standard graph-classification benchmarks (MUTAG, PROTEINS, NCI1, etc.) together with qualitative visualizations of the extracted multi-level concepts. To address the referee’s concern directly, we will expand the experimental section with (i) additional baselines including recent concept-based GNN explainers, (ii) ablation studies isolating the soft-clustering component, and (iii) a quantitative protocol for concept meaningfulness (e.g., fidelity and human-interpretability scores). These additions will be included in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: new architecture defined by construction and validated empirically

full rationale

The paper proposes the Subgraph Concept Network as a new GNN architecture whose core operation is explicitly defined as soft clustering on node concept embeddings to obtain subgraph- and graph-level concepts. This is a definitional step in the architecture rather than a derivation that reduces a claimed result to its own inputs by algebraic identity or fitted-parameter renaming. No equations are shown that equate the output concepts to the clustering inputs by construction, and the value is demonstrated through competitive accuracy and qualitative concept discovery on datasets. Prior concept-based methods are cited as motivation but do not form a self-citation chain that bears the central claim. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the untested domain assumption that soft clustering of node embeddings yields interpretable higher-level concepts. No free parameters or invented physical entities are mentioned; the new architecture itself is the primary addition.

axioms (1)

domain assumption Soft clustering on node concept embeddings produces meaningful subgraph and graph-level concepts that explain model decisions
This is the core mechanism stated in the abstract; no independent verification or formal justification is provided.

invented entities (1)

Subgraph Concept Network no independent evidence
purpose: GNN architecture that performs soft clustering to extract multi-level concepts
New model introduced by the paper; no external evidence of its properties is given beyond the abstract claim.

pith-pipeline@v0.9.0 · 5430 in / 1373 out tokens · 34581 ms · 2026-05-10T04:39:12.144406+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 7 canonical work pages

[1]

The graph neural network model.IEEE transactions on neural networks, 20(1):61–80, 2008

Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph neural network model.IEEE transactions on neural networks, 20(1):61–80, 2008

2008
[2]

Graph neural networks.Nature Reviews Methods Primers, 4(1):17, 2024

Gabriele Corso, Hannes Stark, Stefanie Jegelka, Tommi Jaakkola, and Regina Barzilay. Graph neural networks.Nature Reviews Methods Primers, 4(1):17, 2024

2024
[3]

A comprehensive survey on self-interpretable neural networks.Proceedings of the IEEE, 2025

Yang Ji, Ying Sun, Yuting Zhang, Zhigaoyuan Wang, Yuanxin Zhuang, Zheng Gong, Dazhong Shen, Chuan Qin, Hengshu Zhu, and Hui Xiong. A comprehensive survey on self-interpretable neural networks.Proceedings of the IEEE, 2025

2025
[4]

Graphxai: a survey of graph neural networks (gnns) for explainable ai (xai).Neural Computing and Applications, 37(17):10949–11000, 2025

Mauparna Nandan, Soma Mitra, and Debashis De. Graphxai: a survey of graph neural networks (gnns) for explainable ai (xai).Neural Computing and Applications, 37(17):10949–11000, 2025. URLhttps://doi.org/10.1007/s00521-025-11054-3

work page doi:10.1007/s00521-025-11054-3 2025
[5]

Gcexplainer: Human-in- the-loop concept-based explanations for graph neural networks.arXiv preprint arXiv:2107.11889, 2021

Lucie Charlotte Magister, Dmitry Kazhdan, Vikash Singh, and Pietro Liò. Gcexplainer: Human-in- the-loop concept-based explanations for graph neural networks.arXiv preprint arXiv:2107.11889, 2021

work page arXiv 2021
[6]

Concept distillation in graph neural networks

Lucie Charlotte Magister, Pietro Barbiero, Dmitry Kazhdan, Federico Siciliano, Gabriele Ciravegna, Fabrizio Silvestri, Mateja Jamnik, and Pietro Liò. Concept distillation in graph neural networks. In World Conference on Explainable Artificial Intelligence, pages 233–255. Springer, 2023

2023
[7]

Global explain- ability of gnns via logic combination of learned concepts

Steve Azzolin, Antonio Longa, Pietro Barbiero, Pietro Liò, and Andrea Passerini. Global explain- ability of gnns via logic combination of learned concepts. InInternational Conference on Learning Representations. PMLR, 2023

2023
[8]

Protgnn: Towards self- explaining graph neural networks

Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, and Cheekong Lee. Protgnn: Towards self- explaining graph neural networks. InProceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 9127–9135, 2022

2022
[9]

Interpretable prototype-based graph informa- tion bottleneck.Advances in Neural Information Processing Systems, 36:76737–76748, 2023

Sangwoo Seo, Sungwon Kim, and Chanyoung Park. Interpretable prototype-based graph informa- tion bottleneck.Advances in Neural Information Processing Systems, 36:76737–76748, 2023

2023
[10]

Prototype-based interpretable graph neural networks.IEEE Transactions on Artificial Intelligence, 5(4):1486–1495, 2024

Alessio Ragno, Biagio La Rosa, and Roberto Capobianco. Prototype-based interpretable graph neural networks.IEEE Transactions on Artificial Intelligence, 5(4):1486–1495, 2024

2024
[11]

This looks like that: deep learning for interpretable image recognition.Advances in neural information processing systems, 32, 2019

Chaofan Chen, Oscar Li, Daniel Tao, Alina Barnett, Cynthia Rudin, and Jonathan K Su. This looks like that: deep learning for interpretable image recognition.Advances in neural information processing systems, 32, 2019

2019
[12]

Interpretable image recognition by con- structing transparent embedding space

Jiaqi Wang, Huafeng Liu, Xinyue Wang, and Liping Jing. Interpretable image recognition by con- structing transparent embedding space. InProceedings of the IEEE/CVF international conference on computer vision, pages 895–904, 2021

2021
[13]

Unveiling global interactive patterns across graphs: Towards interpretable graph neural networks

Yuwen Wang, Shunyu Liu, Tongya Zheng, Kaixuan Chen, and Mingli Song. Unveiling global interactive patterns across graphs: Towards interpretable graph neural networks. InProceedings of the 30th ACM SIGKDD conference on knowledge discovery and data mining, pages 3277–3288, 2024

2024
[14]

Graph information bottleneck.Advances in Neural Information Processing Systems, 33:20437–20448, 2020

Tailin Wu, Hongyu Ren, Pan Li, and Jure Leskovec. Graph information bottleneck.Advances in Neural Information Processing Systems, 33:20437–20448, 2020

2020
[15]

Recognizing predictive substructures with subgraph information bottleneck.IEEE transactions on pattern analysis and machine intelligence, 46(3):1650–1663, 2021

Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, and Ran He. Recognizing predictive substructures with subgraph information bottleneck.IEEE transactions on pattern analysis and machine intelligence, 46(3):1650–1663, 2021

2021
[16]

Improving subgraph recognition with variational graph information bottleneck

Junchi Yu, Jie Cao, and Ran He. Improving subgraph recognition with variational graph information bottleneck. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19396–19405, 2022

2022
[17]

Pre-training graph neural networks on molecules by using subgraph-conditioned graph information bottleneck

O-Joun Lee et al. Pre-training graph neural networks on molecules by using subgraph-conditioned graph information bottleneck. InProceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 17204–17213, 2025. 10

2025
[18]

The information bottleneck method

Naftali Tishby, Fernando C Pereira, and William Bialek. The information bottleneck method.arXiv preprint physics/0004057, 2000

work page Pith review arXiv 2000
[19]

Hamilton, and Jure Leskovec

Rex Ying, Jiaxuan You, Christopher Morris, Xiang Ren, William L. Hamilton, and Jure Leskovec. Hierarchical graph representation learning with differentiable pooling. InProceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS’18, page 4805–4815, Red Hook, NY , USA, 2018. Curran Associates Inc

2018
[20]

Structpool: Structured graph pooling via conditional random fields

Hao Yuan and Shuiwang Ji. Structpool: Structured graph pooling via conditional random fields. In International Conference on Learning Representations, 2020. URL https://openreview.net/ forum?id=BJxg_hVtwH

2020
[21]

From gnns to trees: Multi-granular interpretability for graph neural networks

Jie Yang, Yuwen Wang, Kaixuan Chen, Tongya Zheng, Yihe Zhou, Zhenbang Xiao, Ji Cao, Mingli Song, and Shunyu Liu. From gnns to trees: Multi-granular interpretability for graph neural networks. arXiv preprint arXiv:2505.00364, 2025

work page arXiv 2025
[22]

Everybody needs a little help: Explaining graphs via hierarchical concepts.arXiv preprint arXiv:2311.15112, 2023

Jonas Jurss, Lucie Charlotte Magister, Pietro Barbiero, Pietro Liò, and Nikola Simidjievski. Everybody needs a little help: Explaining graphs via hierarchical concepts.arXiv preprint arXiv:2311.15112, 2023

work page arXiv 2023
[23]

Inter- pretable neural-symbolic concept reasoning

Pietro Barbiero, Gabriele Ciravegna, Francesco Giannini, Mateo Espinosa Zarlenga, Lucie Charlotte Magister, Alberto Tonda, Pietro Lió, Frederic Precioso, Mateja Jamnik, and Giuseppe Marra. Inter- pretable neural-symbolic concept reasoning. InProceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org, 2023

2023
[24]

C. E. Shannon. A mathematical theory of communication.The Bell System Technical Journal, 27 (3):379–423, 1948. doi:10.1002/j.1538-7305.1948.tb01338.x

work page doi:10.1002/j.1538-7305.1948.tb01338.x 1948
[25]

On Completeness-aware Concept-Based Explanations in Deep Neural Net- works

Chih-Kuan Yeh, Been Kim, Sercan Ö Arık, Chun-Liang Li, Tomas Pfister, and Pradeep Ravikumar. On Completeness-aware Concept-Based Explanations in Deep Neural Net- works. InAdvances in Neural Information Processing Systems 33 (NeurIPS 2020), vol- ume 33, pages 20554–20565, 2020. URL https://papers.nips.cc/paper/2020/hash/ ecb287ff763c169694f682af52c1f309-Ab...

2020
[26]

Wadsworth, 1984

L Breiman, JH Friedman, R Olshen, and CJ Stone.Classification and Regression Trees. Wadsworth, 1984

1984
[27]

Explaining the explainers in graph neural networks: a comparative study.ACM Computing Surveys, 57(5):1–37, 2025

Antonio Longa, Steve Azzolin, Gabriele Santin, Giulia Cencetti, Pietro Lio, Bruno Lepri, and Andrea Passerini. Explaining the explainers in graph neural networks: a comparative study.ACM Computing Surveys, 57(5):1–37, 2025

2025
[28]

Gnnexplainer: Generating explanations for graph neural networks.Advances in neural information processing systems, 32, 2019

Zhitao Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, and Jure Leskovec. Gnnexplainer: Generating explanations for graph neural networks.Advances in neural information processing systems, 32, 2019

2019
[29]

Erdos and Alfréd Rényi

Paul L. Erdos and Alfréd Rényi. On the evolution of random graphs.Transactions of the Amer- ican Mathematical Society, 286:257–257, 1984. URL https://api.semanticscholar.org/ CorpusID:6829589

1984
[30]

TUDataset: A collection of benchmark datasets for learning with graphs.arXiv preprint arXiv:2007.08663,

Christopher Morris, Nils M Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. Tudataset: A collection of benchmark datasets for learning with graphs.arXiv preprint arXiv:2007.08663, 2020. 11 A Dataset statistics We summarise important dataset statistics in Table 4. Table 4: An overview of key markers of the datasets. Dataset Number...

work page arXiv 2007
[31]

This stands in tension with the utilisation loss term, which encourages not all nodes to be assigned to a single cluster

This indicates that potentially not all clusters are needed. This stands in tension with the utilisation loss term, which encourages not all nodes to be assigned to a single cluster. This lets us hypothesize that examining these values can be useful in deciding the parametrization needs of the model for a given dataset. Overall, we can say that all cluste...