pith. machine review for the scientific record.

arxiv: 2605.07527 · v1 · submitted 2026-05-08 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links · Lean Theorem

Why Self-Inconsistency Arises in GNN Explanations and How to Exploit It

Authors on Pith · no claims yet

Pith reviewed 2026-05-11 02:10 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords self-inconsistent explanations · graph neural networks · SI-GNN · self-denoising · context perturbation · explanation fidelity · post-processing calibration · latent signal assignment

The pith

Self-inconsistency in GNN explanations arises from context perturbation during re-explanation and is corrected by Self-Denoising in one pass

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Self-interpretable graph neural networks can produce explanations whose importance scores change when the model is reapplied to only the selected edges. This occurs because removing the non-selected edges alters the input context seen by the model, which perturbs its internal computations. The paper traces the effect to a latent signal assignment process in which only certain edges carry unstable signals that shift under such changes, and shows that conciseness regularization can increase the number of affected edges. Self-Denoising then uses the observed score variation to identify and adjust these edges through a single extra forward pass, yielding more stable explanations.
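
To make the failure mode concrete, here is a minimal sketch of how that score variation could be measured, assuming an SI-GNN wrapped as a callable `explain(graph)` that returns per-edge importance scores and a `graph.subgraph(edges)` helper that drops everything outside the selected edges; both interfaces and the top-k selection rule are illustrative stand-ins, not the paper's API.

```python
def self_inconsistency(explain, graph, k):
    """Re-apply a (hypothetical) SI-GNN explainer to its own top-k explanation
    subgraph and report how far the surviving edge scores move.

    `explain(graph)` is assumed to return {edge_id: importance_score};
    `graph.subgraph(edges)` is assumed to remove all non-selected edges.
    """
    scores = explain(graph)
    selected = sorted(scores, key=scores.get, reverse=True)[:k]

    # Re-explanation: the model now sees only the selected edges, so the
    # context around each surviving edge has been perturbed.
    rescored = explain(graph.subgraph(selected))

    # Per-edge score variation; large values flag self-inconsistent edges.
    variation = {e: abs(scores[e] - rescored.get(e, 0.0)) for e in selected}
    return variation, sum(variation.values()) / len(variation)
```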

Core claim

Re-explanation-induced context perturbation is the direct cause of score variation in SI-GNN explanations; only edges whose latent signals are unstable to this perturbation become self-inconsistent, and Self-Denoising exploits the variation by calibrating the explanation subgraph in one additional forward pass.

What carries the argument

Self-Denoising (SD), a training-free post-processing step that compares original and re-explained importance scores to detect and adjust self-inconsistent edges caused by context perturbation
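
Read against that description, the calibration loop might look like the following sketch; the down-weighting rule, the `explain` and `predict` hooks, and the helper names are assumptions made for illustration rather than the authors' exact algorithm. The threshold `eta` stands in for the η that the paper selects on validation data (Figures 5-8).

```python
def self_denoise(explain, graph, k, eta=0.1):
    """Schematic Self-Denoising-style calibration: one extra forward pass on
    the explanation subgraph, then adjust edges whose scores are unstable.
    The specific adjustment (taking the smaller of the two scores) is an
    illustrative choice, not the paper's update rule.
    """
    scores = explain(graph)
    selected = sorted(scores, key=scores.get, reverse=True)[:k]
    rescored = explain(graph.subgraph(selected))  # the single extra forward pass

    calibrated = dict(scores)
    for e in selected:
        drift = abs(scores[e] - rescored.get(e, 0.0))
        if drift > eta:                      # unstable latent signal
            calibrated[e] = min(scores[e], rescored.get(e, 0.0))
    return calibrated

def select_eta(explain, predict, val_graphs, k, grid=(0.05, 0.1, 0.2, 0.5)):
    """Pick eta by adapted predictive performance on validation graphs,
    echoing Figures 5-8. `predict(graph, scores)` is a hypothetical hook
    returning 1 if the prediction on the top-k calibrated subgraph matches
    the label, else 0.
    """
    def val_acc(eta):
        return sum(predict(g, self_denoise(explain, g, k, eta))
                   for g in val_graphs) / len(val_graphs)
    return max(grid, key=val_acc)
```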

If this is right

  • Self-inconsistent edges are identified simply by running the model once more on the explanation subgraph and comparing scores.
  • Adjusting or removing these edges raises the fidelity of the explanation to the model's true decision process.
  • The correction works on any SI-GNN without retraining or changes to the underlying architecture.
  • Explanation quality improves across multiple frameworks, backbones, and datasets while adding only 4-6 percent computational cost.
  • Conciseness regularization during explanation generation modulates how many edges exhibit unstable latent signals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Explanation methods for other structured data may exhibit analogous inconsistency when subset selection changes surrounding context.
  • Stability under re-application could become a standard diagnostic for any subgraph-based explanation technique.
  • Explicitly encouraging stable signal assignment during model training might reduce the need for post-processing corrections.
  • The same perturbation analysis could be applied to detect brittle explanations in non-graph models where input masking alters predictions.

Load-bearing premise

That edges whose importance scores change when the model is reapplied to the explanation subgraph do not provide stable evidence for the original prediction and that adjusting them therefore produces a more faithful explanation.

What would settle it

Measure whether the denoised explanation subgraph, when fed back to the model, produces both the original prediction and consistent importance scores on a second re-explanation, and compare the result against standard faithfulness metrics on benchmark datasets.
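
One way to operationalize that check, again with hypothetical `model`, `explain`, and `graph.subgraph` interfaces: confirm that the denoised subgraph preserves the original prediction and that a second re-explanation leaves the calibrated scores essentially unchanged, before comparing against standard faithfulness metrics.

```python
def settles(model, explain, graph, denoised_scores, tol=0.05):
    """Round-trip check for a denoised explanation. `model(graph)` is assumed
    to return a predicted label; `denoised_scores` maps the retained edges to
    their calibrated importances. All interfaces are illustrative.
    """
    edges = list(denoised_scores)
    sub = graph.subgraph(edges)

    same_prediction = model(graph) == model(sub)   # original prediction preserved?

    rescored = explain(sub)                        # second re-explanation
    drift = max(abs(denoised_scores[e] - rescored.get(e, 0.0)) for e in edges)
    return same_prediction and drift <= tol
```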

Figures

Figures reproduced from arXiv: 2605.07527 by Fan Zhou, Ting Zhong, Wenxin Tai, Yaqian Liu.

Figure 1
Figure 1: Illustration of self-inconsistency in SI-GNNs, where re-explanation may remove selected … [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2: Self-inconsistency patterns of four representative SI-GNNs on the BA-2MOTIFS dataset. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3: Correlation analysis between edge score variation and local context variation on SMGNN [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4: Cases of SD on the BENZENE dataset. Top: first-pass explanation. Middle: second-pass … [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5: Selecting η via adapted predictive performance on SMGNN. η is selected on validation data; test results are shown. Other SI-GNNs are reported in Section D. [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
Figure 6
Figure 6: Selecting η via adapted predictive performance on GAT. [PITH_FULL_IMAGE:figures/full_fig_p020_6.png] view at source ↗
Figure 7
Figure 7: Selecting η via adapted predictive performance on CAL. view at source ↗
Figure 8
Figure 8: Selecting η via adapted predictive performance on GSAT. view at source ↗
Figure 9
Figure 9: Edge-score distributions of GAT under different … [PITH_FULL_IMAGE:figures/full_fig_p021_9.png] view at source ↗
Figure 10
Figure 10: Edge-score distributions of CAL under different … [PITH_FULL_IMAGE:figures/full_fig_p022_10.png] view at source ↗
Figure 11
Figure 11: Edge-score distributions of SMGNN under different … [PITH_FULL_IMAGE:figures/full_fig_p022_11.png] view at source ↗
Figure 12
Figure 12: Edge-score distributions of GSAT under different … [PITH_FULL_IMAGE:figures/full_fig_p023_12.png] view at source ↗
Figure 13
Figure 13: Edge-score distributions of GAT under different … [PITH_FULL_IMAGE:figures/full_fig_p023_13.png] view at source ↗
Figure 14
Figure 14: Edge-score distributions of CAL under different … [PITH_FULL_IMAGE:figures/full_fig_p024_14.png] view at source ↗
Figure 15
Figure 15: Edge-score distributions of SMGNN under different … [PITH_FULL_IMAGE:figures/full_fig_p024_15.png] view at source ↗
Figure 16
Figure 16: Edge-score distributions of GSAT under different … [PITH_FULL_IMAGE:figures/full_fig_p025_16.png] view at source ↗
Figure 17
Figure 17: Edge-score distributions of GAT under different … [PITH_FULL_IMAGE:figures/full_fig_p025_17.png] view at source ↗
Figure 18
Figure 18: Edge-score distributions of CAL under different … [PITH_FULL_IMAGE:figures/full_fig_p025_18.png] view at source ↗
Figure 19
Figure 19: Edge-score distributions of SMGNN under different … [PITH_FULL_IMAGE:figures/full_fig_p026_19.png] view at source ↗
Figure 20
Figure 20: Edge-score distributions of GSAT under different … [PITH_FULL_IMAGE:figures/full_fig_p026_20.png] view at source ↗
Figure 21
Figure 21: Edge-score distributions of GAT under different … [PITH_FULL_IMAGE:figures/full_fig_p026_21.png] view at source ↗
Figure 22
Figure 22: Edge-score distributions of CAL under different … [PITH_FULL_IMAGE:figures/full_fig_p027_22.png] view at source ↗
Figure 23
Figure 23: Edge-score distributions of SMGNN under different … [PITH_FULL_IMAGE:figures/full_fig_p027_23.png] view at source ↗
Figure 24
Figure 24: Edge-score distributions of GSAT under different … [PITH_FULL_IMAGE:figures/full_fig_p027_24.png] view at source ↗
read the original abstract

Recent work has observed that explanations produced by Self-Interpretable Graph Neural Networks (SI-GNNs) can be self-inconsistent: when the model is reapplied to its own explanatory graph subset, it may produce a different explanation. However, why self-inconsistency arises remains poorly understood. In this work, we first identify re-explanation-induced context perturbation as the direct cause of score variation. We then introduce a latent signal assignment hypothesis to explain why only some edges are sensitive to this perturbation, and analyze how conciseness regularization affects latent signal assignment. Given that self-inconsistent edges do not provide stable evidence for the model's prediction, we propose Self-Denoising (SD), a model-agnostic and training-free post-processing strategy that calibrates explanations with only one additional forward pass. Experiments across representative SI-GNN frameworks, backbone architectures, and benchmark datasets support our hypothesis and show that SD consistently improves explanation quality while adding only about 4–6% computational overhead in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that self-inconsistency in explanations from Self-Interpretable Graph Neural Networks (SI-GNNs) arises due to re-explanation-induced context perturbation. It introduces a latent signal assignment hypothesis to account for why certain edges are sensitive to this perturbation, analyzes the role of conciseness regularization in modulating such assignments, and concludes that self-inconsistent edges lack stable evidence for the original prediction. Building on this, the authors propose Self-Denoising (SD), a model-agnostic and training-free post-processing method that calibrates explanations using only one additional forward pass, with experiments across SI-GNN frameworks, backbone architectures, and benchmark datasets purportedly showing consistent improvements in explanation quality at 4-6% added computational cost.

Significance. If the hypothesis is correct and SD demonstrably improves explanation fidelity without introducing re-explanation artifacts, the work would offer a practical, low-overhead contribution to GNN explainability. The training-free and model-agnostic nature, combined with the modest overhead, would make it readily adoptable. The analysis of context perturbation and regularization effects could also inform future SI-GNN design.

major comments (2)
  1. [Abstract and hypothesis section] Abstract and the section introducing the latent signal assignment hypothesis: the load-bearing claim that self-inconsistent edges 'do not provide stable evidence for the model's prediction' (and thus that adjusting them via SD improves fidelity) is inferred directly from score variation on the re-explained subgraph. No independent, ground-truth-free probe (such as measuring original-graph prediction agreement after masking the identified edges) is described to rule out the alternative that the inconsistency reflects valid context-dependent signals or re-explanation artifacts.
  2. [Experiments] Experimental section: the abstract asserts that experiments 'support our hypothesis and show that SD consistently improves explanation quality,' yet the manuscript provides no details on the precise fidelity metrics used, statistical tests, controls for re-explanation artifacts, or ablation isolating the effect of the latent signal assignment step from simple score thresholding.
minor comments (2)
  1. [Hypothesis and analysis] The definition and quantification of 'latent signal assignment' should be formalized (e.g., via an equation) to make the analysis of conciseness regularization reproducible.
  2. [Experiments] Computational overhead claims (4-6%) would benefit from a breakdown showing the cost of the additional forward pass versus any post-processing steps.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating the revisions we will make to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract and hypothesis section] Abstract and the section introducing the latent signal assignment hypothesis: the load-bearing claim that self-inconsistent edges 'do not provide stable evidence for the model's prediction' (and thus that adjusting them via SD improves fidelity) is inferred directly from score variation on the re-explained subgraph. No independent, ground-truth-free probe (such as measuring original-graph prediction agreement after masking the identified edges) is described to rule out the alternative that the inconsistency reflects valid context-dependent signals or re-explanation artifacts.

    Authors: We acknowledge that the claim is primarily supported by the observed score variation under re-explanation-induced context perturbation, as described in our latent signal assignment hypothesis. While this variation indicates instability in the assigned signals, we agree that an independent validation would strengthen the argument against alternatives such as valid context-dependent signals. In the revised manuscript, we will add a new analysis that measures the original model's prediction change on the full graph after masking the self-inconsistent edges identified by our method. This ground-truth-free probe will help confirm that these edges do not provide stable evidence and will be reported alongside the existing results. revision: yes

  2. Referee: [Experiments] Experimental section: the abstract asserts that experiments 'support our hypothesis and show that SD consistently improves explanation quality,' yet the manuscript provides no details on the precise fidelity metrics used, statistical tests, controls for re-explanation artifacts, or ablation isolating the effect of the latent signal assignment step from simple score thresholding.

    Authors: We agree that the experimental section would benefit from greater detail to support the claims. In the revision, we will expand it to: (1) explicitly define the fidelity metrics (e.g., how prediction agreement and explanation quality are quantified on the original graph to avoid re-explanation bias); (2) report statistical significance using paired tests such as the Wilcoxon signed-rank test across multiple random seeds and datasets; (3) describe controls for re-explanation artifacts, including comparisons against random perturbations of the same magnitude; and (4) include an ablation study that isolates the latent signal assignment step from simple score thresholding. These additions will clarify the specific contributions of our hypothesis and method. revision: yes
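
The ground-truth-free probe promised in the first response could take roughly the following form; the `graph.remove_edges` helper and the probe itself are an editorial rendering of the proposal, not the authors' implementation.

```python
def masking_probe(model, graph, inconsistent_edges):
    """Mask the edges flagged as self-inconsistent on the *full* graph and
    check whether the model's prediction survives. If it does, those edges
    were not load-bearing evidence for the prediction. `model` and
    `graph.remove_edges` are hypothetical interfaces.
    """
    original = model(graph)
    masked = model(graph.remove_edges(inconsistent_edges))
    return original == masked
```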
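
For the significance testing promised in the second response, a paired Wilcoxon signed-rank test over matched runs (same seed, dataset, and backbone) is a standard choice; this sketch assumes per-run explanation-quality scores have already been collected for the raw and SD-calibrated explainers.

```python
from scipy.stats import wilcoxon

def paired_significance(raw_quality, sd_quality, alpha=0.05):
    """Paired Wilcoxon signed-rank test across matched runs. `raw_quality`
    and `sd_quality` are equal-length sequences of per-run scores for the
    baseline and the SD-calibrated explanations."""
    stat, p_value = wilcoxon(raw_quality, sd_quality)
    return p_value, p_value < alpha
```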

Circularity Check

0 steps flagged

No significant circularity; derivation remains independent of inputs

full rationale

The paper observes self-inconsistency via re-explanation, introduces a hypothesis for its cause (context perturbation and latent signal assignment), and defines Self-Denoising as a post-processing step that uses one additional forward pass to identify and calibrate inconsistent edges. This does not reduce any claimed prediction or result to a fitted quantity or self-citation by construction; the method is explicitly algorithmic and its benefits are asserted via external experiments on multiple frameworks and datasets rather than being equivalent to the initial observations. No load-bearing step matches the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the new hypothesis that only certain edges carry latent signals sensitive to context perturbation and on the assumption that inconsistent edges can be safely removed without harming predictive fidelity. No free parameters are described in the abstract; the work inherits standard GNN and explanation-method assumptions.

axioms (1)
  • domain assumption · Standard assumptions underlying self-interpretable GNN frameworks and existing explanation evaluation metrics
    The paper builds directly on representative SI-GNN frameworks without re-deriving their foundations.
invented entities (1)
  • latent signal assignment · no independent evidence
    purpose: To explain differential sensitivity of edges to re-explanation context perturbation
    Introduced as a hypothesis to account for why only some edges change scores under perturbation.

pith-pipeline@v0.9.0 · 5476 in / 1337 out tokens · 36613 ms · 2026-05-11T02:10:27.643338+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

  1. [1]

    Self-consistency improves the trustworthiness of self-interpretable gnns

    Wenxin Tai, Ting Zhong, Goce Trajcevski, and Fan Zhou. Self-consistency improves the trustworthiness of self-interpretable gnns. InICLR, 2026

  2. [2]

    Graph attention networks

    Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. InICLR, 2018

  3. [3]

    Causal attention for interpretable and generalizable graph classification

    Yongduo Sui, Xiang Wang, Jiancan Wu, Min Lin, Xiangnan He, and Tat-Seng Chua. Causal attention for interpretable and generalizable graph classification. InKDD, 2022

  4. [4]

    Beyond topological self-explainable gnns: A formal explainability perspective

    Steve Azzolin, Sagar Malhotra, Andrea Passerini, and Stefano Teso. Beyond topological self-explainable gnns: A formal explainability perspective. InICML, 2025

  5. [5]

    Interpretable and generalizable graph learning via stochastic attention mechanism

    Siqi Miao, Mia Liu, and Pan Li. Interpretable and generalizable graph learning via stochastic attention mechanism. InICML, 2025

  6. [6]

    Gnnexplainer: Generating explanations for graph neural networks.NeurIPS, 2019

    Zhitao Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, and Jure Leskovec. Gnnexplainer: Generating explanations for graph neural networks.NeurIPS, 2019

  7. [7]

    Parameterized explainer for graph neural network.NeurIPS, 2020

    Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, and Xiang Zhang. Parameterized explainer for graph neural network.NeurIPS, 2020

  8. [8]

    Towards inductive and efficient explanations for graph neural networks.IEEE TPAMI, 2024

    Dongsheng Luo, Tianxiang Zhao, Wei Cheng, Dongkuan Xu, Feng Han, Wenchao Yu, Xiao Liu, Haifeng Chen, and Xiang Zhang. Towards inductive and efficient explanations for graph neural networks.IEEE TPAMI, 2024

  9. [9]

    Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature Machine Intelligence, 2019

    Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature Machine Intelligence, 2019

  10. [10]

    From model explanation to data misinterpretation: Uncovering the pitfalls of post hoc explainers in business research.Arxiv, 2024

    Ronilo Ragodos, Tong Wang, Lu Feng, et al. From model explanation to data misinterpretation: Uncovering the pitfalls of post hoc explainers in business research.Arxiv, 2024

  11. [11]

    Redundancy undermines the trustworthiness of self-interpretable gnns

    Wenxin Tai, Ting Zhong, Goce Trajcevski, and Fan Zhou. Redundancy undermines the trustworthiness of self-interpretable gnns. InICML, 2025

  12. [12]

    Attention is all you need.NeurIPS, 2017

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.NeurIPS, 2017

  13. [13]

    Interpretation and identification of causal mediation.Psychological Methods, 2014

    Judea Pearl. Interpretation and identification of causal mediation.Psychological Methods, 2014

  14. [14]

    Discovering invariant rationales for graph neural networks

    Yingxin Wu, Xiang Wang, An Zhang, Xiangnan He, and Tat-Seng Chua. Discovering invariant rationales for graph neural networks. InICLR, 2022

  15. [15]

    Graph neural networks including sparse interpretability.ArXiv, 2020

    Chris Lin, Gerald J Sun, Krishna C Bulusu, Jonathan R Dry, and Marylens Hernandez. Graph neural networks including sparse interpretability.ArXiv, 2020

  16. [16]

    Graph information bottleneck for subgraph recognition

    Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, and Ran He. Graph information bottleneck for subgraph recognition. InICLR, 2021

  17. [17]

    Improving subgraph recognition with variational graph information bottleneck

    Junchi Yu, Jie Cao, and Ran He. Improving subgraph recognition with variational graph information bottleneck. InCVPR, pages 19396–19405, 2022

  18. [18]

    On explainability of graph neural networks via subgraph explorations

    Hao Yuan, Haiyang Yu, Jie Wang, Kang Li, and Shuiwang Ji. On explainability of graph neural networks via subgraph explorations. InICML, 2021

  19. [19]

    Protgnn: Towards self-explaining graph neural networks

    Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, and Cheekong Lee. Protgnn: Towards self-explaining graph neural networks. InAAAI, 2022

  20. [20]

    A comprehensive survey on self-interpretable neural networks.Proceedings of the IEEE, 2025

    Yang Ji, Ying Sun, Yuting Zhang, Zhigaoyuan Wang, Yuanxin Zhuang, Zheng Gong, Dazhong Shen, Chuan Qin, Hengshu Zhu, and Hui Xiong. A comprehensive survey on self-interpretable neural networks.Proceedings of the IEEE, 2025

  21. [21]

    Discovery of a structural class of antibiotics with explainable deep learning.Nature, 2024

    Felix Wong, Erica J Zheng, Jacqueline A Valeri, Nina M Donghia, Melis N Anahtar, Satotaka Omori, Alicia Li, Andres Cubillos-Ruiz, Aarti Krishnan, Wengong Jin, et al. Discovery of a structural class of antibiotics with explainable deep learning.Nature, 2024. 10

  22. [22]

    Quantitative evaluation of explainable graph neural networks for molecular property prediction.Patterns, 2022

    Jiahua Rao, Shuangjia Zheng, Yutong Lu, and Yuedong Yang. Quantitative evaluation of explainable graph neural networks for molecular property prediction.Patterns, 2022

  23. [23]

    Tudataset: A collection of benchmark datasets for learning with graphs

    Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. Tudataset: A collection of benchmark datasets for learning with graphs. In ICML Workshop, 2020

  24. [25]

    Understanding attention and generalization in graph neural networks

    Boris Knyazev, Graham W Taylor, and Mohamed Amer. Understanding attention and generalization in graph neural networks. In NeurIPS, 2019

  25. [26]

    Gcan: Graph-aware co-attention networks for explainable fake news detection on social media

    Yi Ju Lu and Cheng Te Li. Gcan: Graph-aware co-attention networks for explainable fake news detection on social media. InACL, 2020

  26. [27]

    Interpretable prototype-based graph information bottleneck

    Sangwoo Seo, Sungwon Kim, and Chanyoung Park. Interpretable prototype-based graph information bottleneck. InNeurIPS, 2023

  27. [28]

    How interpretable are interpretable graph neural networks? InICML, 2024

    Yongqiang Chen, Yatao Bian, Bo Han, and James Cheng. How interpretable are interpretable graph neural networks? InICML, 2024

  28. [29]

    Towards self-explainable graph neural network

    Enyan Dai and Suhang Wang. Towards self-explainable graph neural network. InCIKM, 2021

  29. [30]

    Kergnns: Interpretable graph neural networks with graph kernels

    Aosong Feng, Chenyu You, Shiqiang Wang, and Leandros Tassiulas. Kergnns: Interpretable graph neural networks with graph kernels. InAAAI, 2022

  30. [31]

    Graph learning.ArXiv, 2025

    Feng Xia, Ciyuan Peng, Jing Ren, Falih Gozi Febrinanto, Renqiang Luo, Vidya Saikrishna, Shuo Yu, and Xiangjie Kong. Graph learning.ArXiv, 2025

  31. [32]

    Gnn explanations that do not explain and how to find them

    Steve Azzolin, Stefano Teso, Bruno Lepri, Andrea Passerini, and Sagar Malhotra. Gnn explanations that do not explain and how to find them. In ICLR, 2026

  32. [33]

    Convergence of probability measures

    Patrick Billingsley. Convergence of probability measures. John Wiley & Sons, 2013

  33. [34]

    Lectures on the coupling method

    Torgny Lindvall. Lectures on the coupling method. Courier Corporation, 2002

  34. [35]

    Categorical reparameterization with gumbel-softmax

    Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with gumbel-softmax. InICLR, 2017

  35. [36]

    How powerful are graph neural networks? InICLR, 2019

    Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? InICLR, 2019

  36. [37]

    When comparing to ground truth is wrong: On evaluating gnn explanation methods

    Lukas Faber, Amin K. Moghaddam, and Roger Wattenhofer. When comparing to ground truth is wrong: On evaluating gnn explanation methods. In KDD, 2021

  37. [38]

    Graphframex: Towards systematic evaluation of explainability methods for graph neural networks

    Kenza Amara, Zhitao Ying, Zitao Zhang, Zhichao Han, Yang Zhao, Yinan Shan, Ulrik Brandes, Sebastian Schemm, and Ce Zhang. Graphframex: Towards systematic evaluation of explainability methods for graph neural networks. In LOG, 2022

  38. [39]

    Explainability in graph neural networks: A taxonomic survey.IEEE TPAMI, 2022

    Hao Yuan, Haiyang Yu, Shurui Gui, and Shuiwang Ji. Explainability in graph neural networks: A taxonomic survey.IEEE TPAMI, 2022

  39. [40]

    Reconsidering faithfulness in regular, self-explainable and domain invariant gnns

    Steve Azzolin, Antonio Longa, Stefano Teso, Andrea Passerini, et al. Reconsidering faithfulness in regular, self-explainable and domain invariant gnns. InICLR, 2025

  40. [41]

    Inductive representation learning on large graphs.NeurIPS, 2017

    Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs.NeurIPS, 2017

  41. [42]

    Residual gated graph convnets

    Xavier Bresson and Thomas Laurent. Residual gated graph convnets. arXiv preprint arXiv:1711.07553, 2017