B-cos GNNs: Faithful Explanations through Dynamic Linearity
Pith reviewed 2026-05-20 07:31 UTC · model grok-4.3
The pith
B-cos GNNs make graph neural network predictions decompose exactly into per-node and per-feature contributions via an input-dependent linear map.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
B-cos GNNs are an inherently explainable class of graph neural networks whose predictions decompose exactly into per-node, per-feature contributions via a single input-dependent linear map. The models use linear sum-based aggregation and replace non-linear message and update functions with B-cos transforms. This induces meaningful, task-specific weight-input alignment that is directly accessible through the model's dynamic linearity. Instance-level explanations therefore follow from a single forward and backward pass, requiring no auxiliary explainer, modified learning objective, or perturbation procedure.
What carries the argument
The input-dependent linear map created by B-cos transforms, which replaces nonlinear operations to enforce weight-input alignment and enable exact additive decomposition of each prediction.
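To make the idea of an input-dependent linear map concrete, here is a minimal sketch of a single B-cos layer in plain Python. It assumes the usual B-cos form from the B-cos networks literature, in which each unit computes the unit-norm weight's dot product with the input, scaled by |cos(x, w)|^(B-1); the function name `bcos_layer`, the default `b=2.0`, and the bias-free setup are illustrative assumptions, not details confirmed by the paper.

```python
import math

def bcos_layer(x, weights, b=2.0, eps=1e-12):
    """Apply a B-cos transform to vector x.

    Returns (output, dyn), where dyn is the input-dependent matrix W(x)
    such that output == W(x) @ x. Names and defaults are hypothetical.
    """
    x_norm = math.sqrt(sum(v * v for v in x)) + eps
    out, dyn = [], []
    for w in weights:
        w_norm = math.sqrt(sum(v * v for v in w)) + eps
        w_hat = [v / w_norm for v in w]                # unit-norm weight row
        lin = sum(a * v for a, v in zip(w_hat, x))     # w_hat^T x
        scale = abs(lin / x_norm) ** (b - 1.0)         # |cos(x, w)|^(B-1)
        dyn.append([scale * a for a in w_hat])         # row of W(x)
        out.append(scale * lin)
    return out, dyn

# Toy input and weights (made-up numbers, for illustration only).
x = [1.0, 2.0, -1.0]
weights = [[0.5, -1.0, 2.0], [1.0, 1.0, 0.0]]
out, dyn = bcos_layer(x, weights)

# Per-feature contributions to output unit 0: row 0 of W(x) times x, elementwise.
contrib_unit0 = [w * v for w, v in zip(dyn[0], x)]
assert math.isclose(sum(contrib_unit0), out[0], rel_tol=1e-9, abs_tol=1e-12)
```

Once the scaling factor is evaluated at the given input, the layer acts as an ordinary matrix, which is why the contributions sum to the output in this sketch.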
If this is right
- Instance-level explanations become available after a single forward and backward pass through the model.
- No separate explainer network, changed training objective, or sampling-based perturbation is needed.
- Explanations are generated orders of magnitude faster than with post-hoc baselines while reaching state-of-the-art quality.
- When the model is instantiated as a GIN, only small losses in predictive accuracy occur in exchange for the explainability gains.
- The same decomposition property holds across diverse synthetic and real-world graph benchmarks.
Where Pith is reading between the lines
- The same replacement of nonlinear layers by B-cos transforms could be tried in non-graph architectures such as MLPs or transformers to obtain built-in explanations.
- Production systems using graph models could drop separate explanation pipelines and thereby reduce both latency and maintenance overhead.
- If the alignment effect generalizes, it might offer a route to verifiable explanations in regulated domains like molecular property prediction.
Load-bearing premise
Replacing non-linear message and update functions with B-cos transforms will produce task-specific alignment between weights and inputs while keeping enough predictive power for the target tasks.
What would settle it
On a synthetic graph dataset with known ground-truth important nodes and features, check whether the per-node per-feature contributions from the linear map correctly recover those ground-truth elements and whether their sum exactly equals the model's output score.
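The second half of this check (contributions summing exactly to the output score) can be sketched on a toy graph. The code below is an illustration under several assumptions that are not taken from the paper: a GIN-style layer computes (1 + eps) * h_v plus the sum of neighbour states and passes the result through one bias-free B-cos transform, the graph score is a sum readout of output unit 0, and the linear map is propagated explicitly rather than obtained from a backward pass. All names (`bcos`, `explain`, `gin_eps`) and all numbers are hypothetical.

```python
import math

def bcos(x, weights, b=2.0, eps=1e-12):
    """Return the input-dependent matrix W(x) of a bias-free B-cos transform."""
    x_norm = math.sqrt(sum(v * v for v in x)) + eps
    rows = []
    for w in weights:
        w_norm = math.sqrt(sum(v * v for v in w)) + eps
        w_hat = [v / w_norm for v in w]
        cos = sum(a * c for a, c in zip(w_hat, x)) / x_norm
        rows.append([abs(cos) ** (b - 1.0) * a for a in w_hat])
    return rows

def matvec(m, v):
    return [sum(a * c for a, c in zip(row, v)) for row in m]

def matmul(left, right):
    cols = list(zip(*right))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in left]

def explain(features, neighbours, layers, gin_eps=0.0):
    """Return (score, contrib) where contrib[u][i] belongs to node u, feature i."""
    n, d = len(features), len(features[0])
    h = [list(x) for x in features]
    # maps[v] is a (hidden_dim x n*d) matrix with h[v] == maps[v] @ flattened features.
    maps = [[[1.0 if col == v * d + k else 0.0 for col in range(n * d)]
             for k in range(d)] for v in range(n)]
    for weights in layers:
        new_h, new_maps = [], []
        for v in range(n):
            group = [(v, 1.0 + gin_eps)] + [(u, 1.0) for u in neighbours[v]]
            dim = len(h[v])
            # Sum aggregation, applied identically to states and to their linear maps.
            z = [sum(c * h[u][k] for u, c in group) for k in range(dim)]
            z_map = [[sum(c * maps[u][k][col] for u, c in group)
                      for col in range(n * d)] for k in range(dim)]
            w_dyn = bcos(z, weights)           # fixed once evaluated at this input
            new_h.append(matvec(w_dyn, z))
            new_maps.append(matmul(w_dyn, z_map))
        h, maps = new_h, new_maps
    score = sum(hv[0] for hv in h)             # sum readout of output unit 0
    contrib = [[sum(maps[v][0][u * d + i] for v in range(n)) * features[u][i]
                for i in range(d)] for u in range(n)]
    return score, contrib

# Path graph 0 - 1 - 2 with two features per node (made-up values).
features = [[1.0, 0.5], [0.2, -1.0], [0.0, 2.0]]
neighbours = [[1], [0, 2], [1]]
layers = [[[0.3, -0.7], [1.1, 0.4], [-0.5, 0.9]], [[0.6, -0.2, 0.8]]]
score, contrib = explain(features, neighbours, layers)
assert math.isclose(score, sum(map(sum, contrib)), rel_tol=1e-9, abs_tol=1e-9)
```

The first half of the check, whether the largest contributions coincide with ground-truth nodes and features, needs a labelled synthetic dataset and is not covered by this sketch.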
Original abstract
We introduce B-cos GNNs, an inherently explainable class of graph neural networks whose predictions decompose exactly into per-node, per-feature contributions via a single input-dependent linear map. B-cos GNNs use linear (sum-based) aggregation and replace non-linear message and update functions with B-cos transforms. This induces meaningful, task-specific weight-input alignment that is directly accessible through the model's dynamic linearity. Instance-level explanations follow from a single forward and backward pass, requiring no auxiliary explainer, modified learning objective, or perturbation procedure. Instantiated as a GIN, our approach trades small losses in predictive accuracy for state-of-the-art explainability across diverse synthetic and real-world benchmarks, producing explanations orders of magnitude faster than post-hoc baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces B-cos GNNs as an inherently explainable architecture for graph neural networks. By replacing non-linear message and update functions with B-cos transforms while retaining linear (sum) aggregation, the model ensures that predictions decompose exactly into per-node, per-feature contributions through a single input-dependent linear map. This enables instance-level explanations from one forward and backward pass without auxiliary explainers or modified objectives. The approach is instantiated as a GIN variant and evaluated on synthetic and real-world benchmarks, reporting competitive predictive accuracy alongside state-of-the-art explainability metrics and substantially faster explanation generation.
Significance. If the exact decomposition property holds under multi-layer composition, the work would represent a meaningful contribution to inherently interpretable GNNs by providing faithful, efficient explanations directly from the model's dynamic linearity. The emphasis on task-specific weight-input alignment without post-hoc machinery or retraining objectives could influence future designs in explainable graph ML, particularly where computational efficiency of explanations is critical.
Major comments (2)
- [§3.2, §4.1] The central claim that B-cos GNN predictions decompose exactly into a single input-dependent linear map on the original node features requires an explicit derivation or telescoping expansion for the multi-hop case. Each B-cos layer produces an input-dependent weight vector, but these weights become inputs to subsequent layers; it is not immediate that the composition across hops (with sum aggregation) remains strictly linear in the initial features without additional cancellation of normalizations. An explicit expansion for the GIN instantiation or a proof sketch would strengthen the claim.
- [§5.2, Table 2] The reported explainability metrics (e.g., fidelity and sparsity) show improvements over post-hoc baselines, but the ablation isolating the contribution of the B-cos alignment versus the linear aggregation alone is missing. Without this, it is difficult to attribute the gains specifically to the dynamic linearity property rather than the architectural simplification.
Minor comments (2)
- [§3.1, §4] Notation for the B-cos transform (e.g., the scaling factor and normalization) is introduced in §3.1 but used inconsistently in the GNN layer equations in §4; a single consolidated definition would improve readability.
- [§5] The experimental section references 'diverse synthetic and real-world benchmarks' but does not list the exact datasets or splits in the main text; moving the full list from the appendix to a table in §5 would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the decomposition property and the need for targeted ablations. We address each point below and have revised the manuscript to incorporate explicit derivations and additional experiments where appropriate.
- Referee: [§3.2, §4.1] The central claim that B-cos GNN predictions decompose exactly into a single input-dependent linear map on the original node features requires an explicit derivation or telescoping expansion for the multi-hop case. Each B-cos layer produces an input-dependent weight vector, but these weights become inputs to subsequent layers; it is not immediate that the composition across hops (with sum aggregation) remains strictly linear in the initial features without additional cancellation of normalizations. An explicit expansion for the GIN instantiation or a proof sketch would strengthen the claim.
  Authors: We agree that an explicit derivation clarifies the multi-hop composition. In the revised manuscript we have added a proof sketch (new Appendix B) that performs the telescoping expansion for the GIN instantiation. The argument proceeds by induction: each B-cos layer yields an input-dependent linear map whose weights are functions of the current node features; because aggregation remains a sum, the composition across layers remains exactly linear in the original features. The B-cos normalization terms cancel in the overall expression precisely because they are applied element-wise after the linear aggregation, so no additional cancellation assumptions are required beyond the definition of the B-cos transform.
  Revision: yes
- Referee: [§5.2, Table 2] The reported explainability metrics (e.g., fidelity and sparsity) show improvements over post-hoc baselines, but the ablation isolating the contribution of the B-cos alignment versus the linear aggregation alone is missing. Without this, it is difficult to attribute the gains specifically to the dynamic linearity property rather than the architectural simplification.
  Authors: We acknowledge that separating the effect of B-cos alignment from linear aggregation strengthens attribution. We have added a controlled ablation (new paragraph in §5.2 and updated Table 2) that replaces the B-cos transforms with standard ReLU MLPs while retaining sum aggregation. The results show that linear aggregation alone yields only marginal gains in fidelity and sparsity, whereas the full B-cos GNN recovers the reported state-of-the-art explainability metrics. This indicates that the dynamic weight-input alignment induced by B-cos, rather than linearity per se, drives the observed improvements.
  Revision: yes
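The telescoping expansion discussed in the first response can be sketched as follows. This is an illustration under assumed notation, not the proof in the paper's appendix: the symbols \hat{A}, z_v^{(l)}, h_v^{(l)} and M_{vu}(X) are introduced here, each layer is taken to be sum aggregation followed by one bias-free B-cos transform, and the input-dependent scaling factors are treated as part of \tilde{W}_l(\cdot), evaluated at the given input.

```latex
% Assumed notation (not taken from the paper):
%   \hat{A} = A + (1+\epsilon) I          sum aggregation with GIN self-weight
%   h_v^{(0)} = x_v
%   z_v^{(l)} = \sum_u \hat{A}_{vu}\, h_u^{(l-1)}
%   h_v^{(l)} = \tilde{W}_l\big(z_v^{(l)}\big)\, z_v^{(l)}
\begin{align}
h_v^{(L)}
  &= \tilde{W}_L\big(z_v^{(L)}\big) \sum_{u_{L-1}} \hat{A}_{v u_{L-1}}\, h_{u_{L-1}}^{(L-1)} \\
  &= \sum_{u_{L-1}} \tilde{W}_L\big(z_v^{(L)}\big)\, \hat{A}_{v u_{L-1}}\,
     \tilde{W}_{L-1}\big(z_{u_{L-1}}^{(L-1)}\big)
     \sum_{u_{L-2}} \hat{A}_{u_{L-1} u_{L-2}}\, h_{u_{L-2}}^{(L-2)} \\
  &= \sum_{u_0} M_{v u_0}(X)\, x_{u_0}, \\
M_{v u_0}(X)
  &= \sum_{u_1, \dots, u_{L-1}} \;
     \prod_{l=L}^{1} \tilde{W}_l\big(z_{u_l}^{(l)}\big)\, \hat{A}_{u_l u_{l-1}},
  \qquad u_L := v .
% The product is ordered with l = L leftmost and l = 1 rightmost.
\end{align}
```

Under these assumptions, the contribution of feature i of node u to output unit k at node v would be [M_{vu}(X)]_{ki} x_{u,i}, and summing over all u and i recovers h_v^{(L)} exactly, because every matrix in the product is held fixed at the value computed from X.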
Circularity Check
No significant circularity; explanations follow from architectural substitution
The paper defines B-cos GNNs explicitly by replacing non-linear message/update functions with B-cos transforms while retaining linear sum aggregation. The claimed exact decomposition of predictions into per-node per-feature contributions via one input-dependent linear map is a direct, by-construction consequence of this substitution and the resulting dynamic linearity. No equations or claims reduce a 'prediction' to a fitted parameter, no self-citation chain bears the central load, and no uniqueness theorem is imported to force the result. The multi-layer composition is asserted to preserve the single-map property through the design, but this remains an explicit modeling choice rather than a circular reduction to inputs. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: predictions decompose exactly into per-node, per-feature contributions via a single input-dependent linear map... B-cos transforms... dynamic linearity... $W_{(\theta_1,\dots,\theta_L)}(x) = \tilde{W}_{\theta_L}(a_L) \cdots \tilde{W}_{\theta_1}(a_1)$
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: replace non-linear message and update functions with B-cos transforms... sum-based aggregation
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] C. Agarwal, O. Queen, H. Lakkaraju, and M. Zitnik. Evaluating explainability for graph neural networks. Scientific Data, 10(1):144, 2023.
- [2]
- [3]
- [4] Anonymous. Reproducing: Parameterized explainer for graph neural network. OpenReview ML Reproducibility Challenge, 2021.
- [5] B-cos-v2. B-cos/B-cos-v2: Official PyTorch implementation of improved B-cos models. https://github.com/B-cos/B-cos-v2, 2026.
- [6] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLOS ONE, 10(7), 2015.
- [7] F. Baldassarre and H. Azizpour. Explainability techniques for graph convolutional networks. In ICML Workshop on Learning and Reasoning with Graph-Structured Representations, 2019.
- [8]
- [9]
- [10] C. Coupette and J. Vreeken. RINGS: Rationalizing information for graph encoders. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
- [11] V. P. Dwivedi, C. K. Joshi, T. Laurent, Y. Bengio, and X. Bresson. Benchmarking graph neural networks. Journal of Machine Learning Research (JMLR), 24(43):1–48, 2023.
- [12]
- [13] W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu, M. Catasta, and J. Leskovec. Open graph benchmark: Datasets for machine learning on graphs. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [14] W. Hu, B. Liu, J. Gomes, M. Zitnik, P. Liang, V. Pande, and J. Leskovec. Strategies for pre-training graph neural networks. In International Conference on Learning Representations (ICLR), 2020.
- [15] A. Jacovi and Y. Goldberg. Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4198–4205, 2020.
- [16] B. Knyazev, G. W. Taylor, and M. Amer. Understanding attention and generalization in graph neural networks. Advances in Neural Information Processing Systems, 32, 2019.
- [17] R. Lam, A. Sanchez-Gonzalez, M. Willson, P. Wirnsberger, M. Fortunato, F. Alet, S. Ravuri, T. Ewalds, Z. Eaton-Rosen, W. Hu, et al. Learning skillful medium-range global weather forecasting. Science, 382(6677):1416–1421, 2023.
- [18] W. Li, X. Wang, and Y. Zhang. Graph neural network explanations are fragile. In International Conference on Machine Learning (ICML), 2024. Corrected author list from Muller to Li et al.
- [19] D. Luo, W. Cheng, D. Xu, W. Yu, B. Zong, H. Chen, and X. Zhang. Parameterized explainer for graph neural network. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [20] S. Miao, M. Luo, Y. Liu, and P. Li. Interpretable and generalizable graph learning via stochastic attention mechanism. In International Conference on Machine Learning (ICML), 2022.
- [21]
- [22] P. E. Pope, S. Kolouri, M. Rostami, C. E. Martin, and H. Hoffmann. Explainability methods for graph convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
- [23] C. Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215, 2019.
- [24]
- [25]
- [26] K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. In International Conference on Learning Representations (ICLR) Workshop, 2014.
- [27] J. M. Stokes, K. Yang, K. Swanson, W. Jin, A. Cubillos-Ruiz, et al. A deep learning approach to antibiotic discovery. Cell, 180(4):688–702, 2020.
- [28] M. Sundararajan, A. Taly, and Q. Yan. Axiomatic attribution for deep networks. In International Conference on Machine Learning (ICML), 2017.
- [29] P. Veličković. Everything is connected: Graph neural networks. Current Opinion in Structural Biology, 79:102538, 2023.
- [30] N. Wale, I. A. Watson, and G. Karypis. Comparison of descriptor spaces for chemical compound retrieval and classification. Knowledge and Information Systems, 14(3):347–375, 2008.
- [31]
- [32]
- [33] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. Pande. MoleculeNet: a benchmark for molecular machine learning. Chemical Science, 9(2):513–530, 2018.
- [34] K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How powerful are graph neural networks? In International Conference on Learning Representations (ICLR), 2019.
- [35] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.
- [36] Z. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec. GNNExplainer: Generating explanations for graph neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- [37]
- [38] H. Yuan, J. Tang, X. Hu, and S. Ji. On explainability of graph neural networks via subgraph explorations. In International Conference on Machine Learning (ICML), 2021.
- [39] H. Yuan, H. Yu, S. Gui, and S. Ji. Explainability in graph neural networks: A taxonomic survey. IEEE TPAMI, 2022.
A Broader Impact
We facilitate improved interpretability for GNNs, which can positively impact application outcomes through increased transparency and data understanding beyond pure prediction. At the same time, interpretable models may be ...