CIExplainer++: Generating Causal and Interpretable Explanations for Graph Neural Networks

Cláudia Soares; Francisco Caldas; Ruben Belo; Sahil Satish Kumar

arXiv: 2606.20747 · v1 · pith:Z3QGEEWUnew · submitted 2026-06-17 · cs.LG · stat.OT

Pith reviewed 2026-06-26 20:35 UTC · model grok-4.3

classification: cs.LG · stat.OT

keywords: explainable AI · graph neural networks · causal inference · subgraph explanations · potential outcomes framework · natural language explanations

The pith

CIExplainer identifies the subgraph with the highest causal effects on GNN predictions using the Potential Outcome Framework.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CIExplainer, a perturbation-based method for explaining GNNs that uses the potential outcome framework to find the subgraph with the strongest causal influence on the model's output. It also presents G2TeXplainer, which converts those subgraphs into natural-language explanations covering both features and relations. The aim is to separate genuine causal drivers of a prediction from mere correlations in graph-based models. The method is evaluated on four GNN architectures (GCN, GraphSAGE, GAT, GIN) and multiple datasets.

Core claim

CIExplainer identifies the subgraph with the highest causal effects on GNN predictions using the Potential Outcome Framework. To bridge subgraph explanations with human interpretability, G2TeXplainer transforms causal subgraphs into natural language explanations that capture both feature-level and relational information.

What carries the argument

The Potential Outcome Framework applied via perturbations to isolate causal subgraph effects on GNN predictions.
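One way to write down the quantity this describes, offered as a sketch under stated assumptions rather than the paper's own definition: let $f$ be the trained GNN, $G$ the input graph, $S$ a candidate subgraph, and $G_{\setminus S}$ the graph with $S$ perturbed. The symbols, the use of removal as the perturbation, and the size budget $k$ are all illustrative choices that the abstract does not specify.

```latex
\begin{align}
Y(1) &= f(G), \qquad Y(0) = f(G_{\setminus S}), \\
\tau(S) &= Y(1) - Y(0), \\
S^{\star} &= \arg\max_{S \subseteq G,\; |S| \le k} \; \lvert \tau(S) \rvert .
\end{align}
```

Here $Y(1)$ is the outcome with the subgraph present (treated), $Y(0)$ the outcome with it perturbed (control), and $\tau(S)$ the resulting effect on the prediction. Because $f$ is deterministic, both potential outcomes can be computed for the same graph, which is what makes a perturbation-based estimate possible in the first place.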

If this is right

Explanations are based on causal effects measured through the potential outcomes framework.
The method is tested on GCN, GraphSAGE, GAT, and GIN architectures.
Causal subgraphs are converted to natural language for interpretability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Causal explanations may help identify biases in GNN training data or structure.
The method could be adapted for other types of graph models or tasks.
It opens the possibility of using causal subgraph analysis to improve model robustness.

Load-bearing premise

That the potential outcome framework can be directly applied to graph data via perturbations to isolate causal subgraph effects without additional unstated assumptions about interference, counterfactual definition, or hidden variables in the graph structure.

What would settle it

Observing that the subgraphs found by CIExplainer do not lead to larger changes in GNN predictions compared to those found by non-causal methods when the same perturbation is applied across multiple datasets.
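A minimal sketch of that comparison, assuming a toy one-layer mean-aggregation classifier in NumPy rather than any of the paper's trained models. The function names (`predict`, `remove_edges`, `prediction_change`), the three-node graph, and the two candidate explanations are hypothetical and chosen only to make the test concrete.

```python
import numpy as np

def predict(adj, feats, weight):
    """Toy graph classifier (an assumption): mean aggregation, sum readout, sigmoid."""
    a_hat = adj + np.eye(adj.shape[0])                # add self-loops
    norm = a_hat / a_hat.sum(axis=1, keepdims=True)   # row-normalise: mean over neighbours
    hidden = np.tanh(norm @ feats @ weight)
    return float(1.0 / (1.0 + np.exp(-hidden.sum())))

def remove_edges(adj, edges):
    """Return a copy of adj with the given undirected edges deleted."""
    out = adj.copy()
    for i, j in edges:
        out[i, j] = out[j, i] = 0.0
    return out

def prediction_change(adj, feats, weight, edges):
    """Absolute change in the prediction when `edges` are removed."""
    before = predict(adj, feats, weight)
    after = predict(remove_edges(adj, edges), feats, weight)
    return abs(before - after)

# Path graph 0 - 1 - 2 with a single informative feature on node 0.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
feats = np.array([[1.0], [0.0], [0.0]])
weight = np.array([[1.0]])

# Two hypothetical explanations of equal size, e.g. from two different explainers.
explanation_a = [(0, 1)]
explanation_b = [(1, 2)]
change_a = prediction_change(adj, feats, weight, explanation_a)
change_b = prediction_change(adj, feats, weight, explanation_b)
print(f"explainer A: {change_a:.4f}  explainer B: {change_b:.4f}")
```

Holding the perturbation and the explanation size fixed is the point of the design: any difference in prediction change is then attributable to which edges were selected, not to how many were removed or how.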

Figures

Figures reproduced from arXiv: 2606.20747 by Cláudia Soares, Francisco Caldas, Ruben Belo, Sahil Satish Kumar.

**Figure 1.** Diagram of the CIExplainer proposed pipeline. Using as example the explanation of a node classification task, …

**Figure 2.** Two examples of the explained subgraph alongside the textual description, for the BA-2Motifs dataset.

**Figure 3.** Comparison of aggregation method results for Causal Effect. The values are calculated across all classification …

**Figure 4.** Comparison of aggregation method results for Causal Effect. The values are calculated across all classification …

**Figure 5.** Comparison of aggregation method results for Causal Effect. The values are calculated across all classification …

**Figure 6.** Comparison of sampling distributions results for Causal Effect. The values are calculated across all classification …

**Figure 7.** Training Loss for the node classification task. Loss is presented for the training and validation sets. Across the …

**Figure 8.** Training accuracy for the graph classification task. Accuracy is presented for the training and validation sets.

**Figure 9.** Distribution of LLM-as-judge Node Fidelity scores for different graph types. Each cell shows the number of …

**Figure 10.** Distribution of LLM-as-judge Structural scores for different graph types. Each cell shows the number of …

Abstract

Explainable Artificial Intelligence aims to make black-box models more trustworthy by presenting, in a human-understandable manner, the elements that lead to the model's output. This involves both (i) identifying components and connections with genuine causal influence on outputs and (ii) translating such structures into an interpretable representation. For the former, we introduce CIExplainer, a novel perturbation-based method grounded in causal inference for explaining Graph Neural Networks (GNNs). CIExplainer identifies the subgraph with the highest causal effects on GNN predictions using the Potential Outcome Framework. We evaluate and compare CIExplainer on various GNN architectures (GCN, GraphSAGE, GAT, GIN) and datasets. To bridge subgraph explanations with human interpretability, we further propose G2TeXplainer, a method that transforms causal subgraphs into natural language explanations that capture both feature-level and relational information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

CIExplainer applies potential outcomes to subgraph perturbations for GNNs but leaves the interference problem unaddressed.

CIExplainer uses the Potential Outcome Framework on perturbations to pick the subgraph with the biggest causal effect on a GNN prediction, then G2TeXplainer turns that subgraph into natural-language text.

The new piece is the explicit causal-inference framing plus the text-generation step on top of subgraph search. Standard perturbation methods exist; tying them to potential outcomes and then to readable text is the incremental move here. The evaluation plan across GCN, GraphSAGE, GAT, and GIN on multiple datasets is also straightforward and worth seeing in full.

The soft spot is exactly the one flagged in the stress-test note. In any connected graph, altering one subgraph changes the messages that reach neighboring nodes, so the stable-unit-treatment-value assumption is violated by construction. The abstract gives no indication that the authors define counterfactuals to avoid this, adjust for topological confounders, or restrict the method to graphs where interference can be ignored. Without that, the causal-effect numbers rest on an assumption that does not hold for typical GNN message-passing.
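The interference point can be made concrete with a small sketch. It assumes an untrained, linear mean-aggregation propagation on a four-node path graph, which is a deliberate simplification and not the paper's setup; `propagate` and the variable names are hypothetical. The "treatment" is removing the edge between nodes 0 and 1, and the question is whether node 2, which is not incident to that edge, is affected.

```python
import numpy as np

def propagate(adj, feats, num_layers):
    """Apply `num_layers` rounds of mean aggregation with self-loops (a toy stand-in for a GNN)."""
    a_hat = adj + np.eye(adj.shape[0])
    norm = a_hat / a_hat.sum(axis=1, keepdims=True)
    h = feats
    for _ in range(num_layers):
        h = norm @ h
    return h

# Path graph 0 - 1 - 2 - 3; only node 0 carries a non-zero feature.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0], [0.0], [0.0], [0.0]])

# Treatment: delete the edge (0, 1). Node 2 is not an endpoint of this edge.
perturbed = adj.copy()
perturbed[0, 1] = perturbed[1, 0] = 0.0

shift_one_layer = propagate(adj, feats, 1) - propagate(perturbed, feats, 1)
shift_two_layers = propagate(adj, feats, 2) - propagate(perturbed, feats, 2)
print("node 2 shift, 1 layer :", shift_one_layer[2, 0])
print("node 2 shift, 2 layers:", shift_two_layers[2, 0])
```

With one layer the representation of node 2 is untouched; with two layers it moves, because node 1 relays the change. In this toy setting, then, the effect of a perturbation spreads as far as the network is deep, which is the spillover that a no-interference assumption rules out.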

This is aimed at people who already work on post-hoc explanations for graph models and want a causal angle. A reader looking for immediately usable code or proven gains over existing methods will not find enough in the abstract.

The idea is coherent enough on its own terms to deserve referee time, even if the causal claims need heavy checking. I would send it out for review.

Referee Report

1 major / 1 minor

Summary. The paper introduces CIExplainer, a perturbation-based explainer for GNNs that identifies the subgraph with maximal causal effect on model predictions via the Potential Outcome Framework. It additionally proposes G2TeXplainer to convert the identified causal subgraphs into natural-language explanations incorporating both node features and relational structure. The approach is evaluated across GCN, GraphSAGE, GAT and GIN architectures on multiple datasets.

Significance. A sound causal identification procedure that respects graph-specific dependence structures would strengthen the reliability of subgraph explanations relative to purely correlational perturbation methods. The natural-language translation component could improve human interpretability if the underlying causal subgraphs are correctly recovered.

major comments (1)

[Abstract / Method] Abstract and method description: the central claim that CIExplainer isolates the subgraph with the highest causal effect rests on direct application of the Potential Outcome Framework to graph perturbations. The manuscript provides no indication that counterfactuals are defined in a manner that respects the message-passing dependencies of GNNs or that the Stable Unit Treatment Value Assumption (SUTVA) is maintained; node interference through edges is therefore unaddressed and load-bearing for the causal interpretation.

minor comments (1)

The abstract states that evaluations were performed but reports neither quantitative metrics nor baseline comparisons; adding a concise summary of key results (e.g., fidelity or causal-effect scores) would improve the abstract.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the causal foundations of our approach. We respond to the major comment below.

Referee: [Abstract / Method] Abstract and method description: the central claim that CIExplainer isolates the subgraph with the highest causal effect rests on direct application of the Potential Outcome Framework to graph perturbations. The manuscript provides no indication that counterfactuals are defined in a manner that respects the message-passing dependencies of GNNs or that the Stable Unit Treatment Value Assumption (SUTVA) is maintained; node interference through edges is therefore unaddressed and load-bearing for the causal interpretation.

Authors: We agree that the manuscript does not explicitly discuss the definition of counterfactuals with respect to GNN message-passing dependencies or address potential violations of SUTVA arising from node interference via edges. This is a substantive point for the causal claims. In the revised version we will add a new subsection to the Methods section that (i) formally defines the potential outcomes under subgraph perturbation, (ii) states the maintained assumptions including the approximation that treats the selected subgraph as the treatment unit while holding the remainder of the graph fixed, and (iii) acknowledges that edge-mediated interference may violate SUTVA and is therefore a limitation of the current causal interpretation. We will also update the abstract to reflect this clarification.

revision: yes

Circularity Check

0 steps flagged

No circularity detected; derivation applies standard causal framework without self-reduction

The provided abstract and description introduce CIExplainer as a perturbation-based method using the Potential Outcome Framework to identify high-causal-effect subgraphs for GNN explanations, followed by G2TeXplainer for natural language translation. No equations, parameter fits, self-citations, or uniqueness theorems are quoted that reduce the central claim to its own inputs by construction. The approach is presented as an application of existing causal inference tools to graph perturbations, remaining self-contained against external benchmarks without load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, background axioms, or new postulated entities.

