Recognition: 2 Lean theorem links
Causal Explanations from the Geometric Properties of ReLU Neural Networks
Pith reviewed 2026-05-12 05:06 UTC · model grok-4.3
The pith
ReLU neural networks divide input space into polytopal regions that directly yield accurate causal explanations for decisions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A ReLU network computes a piecewise linear function over an input space divided into regions, each an n-dimensional convex polytope. This geometric representation can be used to generate causal explanations for the network's behaviour by extracting rules directly from the geometry, so the explanations are an accurate reflection of the network's behaviour.
What carries the argument
The partitioning of the input space into convex polytopal regions, inside each of which every output neuron applies a fixed linear function.
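Concretely, fixing the on/off pattern of the hidden ReLUs fixes both the polytope and the affine map that is exactly the network on it. A minimal sketch, assuming a toy two-layer network with arbitrary weights (nothing here comes from the paper itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU network; the weights are made up for illustration.
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # hidden layer
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # output layer

def activation_pattern(x):
    """Bit vector recording which hidden ReLUs are active at x.
    The polytope containing x is the set of points sharing this pattern:
    an intersection of half-spaces, one per hidden neuron's hyperplane."""
    return W1 @ x + b1 > 0

def region_affine_map(pattern):
    """Inside one polytopal region the network is exactly x -> A x + c."""
    D = np.diag(pattern.astype(float))  # zeroes out inactive units
    A = W2 @ D @ W1
    c = W2 @ D @ b1 + b2
    return A, c

x = np.array([0.3, -1.2])
A, c = region_affine_map(activation_pattern(x))
forward = W2 @ np.maximum(W1 @ x + b1, 0.0) + b2
assert np.allclose(A @ x + c, forward)  # the region rule is the network here
```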
Load-bearing premise
The polytopal regions and their associated linear functions inside a ReLU network correspond to causally meaningful factors that can be extracted as human-interpretable explanations without additional assumptions about the task or data.
What would settle it
An input point for which the causal rule derived from its containing polytope and bounding hyperplanes produces a different output value or decision than the actual forward pass of the network.
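Such a test is mechanical to run: derive the region rule at each sampled point and compare it to the forward pass. A hedged sketch on the same kind of toy network (hypothetical weights, not the paper's experiments); exact piecewise linearity predicts the assertion never fires:

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)
W2, b2 = rng.normal(size=(2, 8)), rng.normal(size=2)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def region_rule(x):
    """Affine rule read off the polytope containing x."""
    D = np.diag((W1 @ x + b1 > 0).astype(float))
    return (W2 @ D @ W1) @ x + W2 @ D @ b1 + b2

# A single mismatch would be the settling counterexample.
for _ in range(10_000):
    x = rng.normal(size=3)
    assert np.allclose(region_rule(x), forward(x))
print("no counterexample in 10,000 random inputs")
```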
read the original abstract
Neural networks have proved an effective means of learning control policies for autonomous systems, but these learned policies are difficult to understand due to the black-box nature of neural networks. This lack of interpretability makes safety assurance for such autonomous systems challenging. The fields of eXplainable Artificial Intelligence (XAI) and eXplainable Reinforcement Learning (XRL) aim to interpret the decision making processes of neural networks and autonomous agents, respectively. In particular, work on causal explanations aims to provide "why" and "why not" explanations for why a model made a given decision. However, most of the work on explainability to date utilises a distilled version of the original model. While this distilled policy is interpretable, it necessarily degrades in performance significantly when compared to the original model, and is not guaranteed to be an accurate reflection of the decision making processes in the original model and as such cannot be used to guarantee its safety. Recent work on understanding the geometry of ReLU neural networks shows that a ReLU network corresponds to a piecewise linear function divided into regions defined by an n-dimensional convex polytope. Through this lens, a neural network can be understood as dividing the input space into distinct regions which apply a single linear function for each output neuron. We show that this geometric representation can be used to generate causal explanations for the network's behaviour similar to previous work, but which extracts rules directly from the geometry of Neural Networks with the ReLU activation function, and is therefore an accurate reflection of the network's behaviour.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the geometric decomposition of ReLU networks into polytopal regions of input space, each governed by a single linear map, can be directly mined to produce causal 'why' and 'why not' explanations that are faithful to the original network, avoiding the fidelity loss of distilled surrogate models for safety-critical autonomous control policies.
Significance. If the geometric-to-causal mapping were made explicit and validated, the approach would usefully extend existing geometric analyses of ReLU networks to the causal-explanation setting in XAI/XRL, preserving exact piecewise-linear behavior. The manuscript correctly highlights the limitations of distillation-based methods and grounds its proposal in the established polytope characterization of ReLU activations.
major comments (2)
- [Abstract] The assertion that the geometric representation 'can be used to generate causal explanations' and 'extracts rules directly from the geometry' is presented without any derivation, algorithm, pseudocode, or worked example showing how polytopal boundaries or per-region linear coefficients are mapped onto causal interventions, counterfactuals, or a structural causal model.
- [Abstract, stated weakest assumption] The claim that the hyperplane boundaries and linear functions inside each polytope 'correspond to causally meaningful factors' is asserted but not justified; the boundaries are determined solely by learned weights and pre-activation thresholds, supplying an exact functional partition rather than an explicit causal graph or do-operator semantics.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments correctly identify that the abstract and core claim would benefit from greater explicitness in mapping geometry to causal semantics. We will revise the manuscript to include the requested derivations, algorithm, and worked example while preserving the central contribution that the exact polytopal decomposition yields faithful explanations without surrogate fidelity loss.
read point-by-point responses
- Referee: [Abstract] The assertion that the geometric representation 'can be used to generate causal explanations' and 'extracts rules directly from the geometry' is presented without any derivation, algorithm, pseudocode, or worked example showing how polytopal boundaries or per-region linear coefficients are mapped onto causal interventions, counterfactuals, or a structural causal model.
Authors: We accept the observation. The current manuscript establishes the geometric equivalence and contrasts it with distillation methods but does not yet supply the explicit extraction procedure. In revision we will insert a dedicated subsection containing (i) a formal mapping from per-region affine coefficients to local causal effects, (ii) pseudocode for enumerating 'why' attributions via feature weights and 'why-not' counterfactuals via adjacent-polytope boundary crossings, and (iii) a fully worked numerical example on a two-layer ReLU policy network that demonstrates the generated explanations match the original network's piecewise-linear behavior exactly. revision: yes
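For orientation, a minimal sketch of what such an extraction procedure could look like, assuming a toy action-valued network; this is our reading of the promised pseudocode, not the authors' implementation, and the L2 boundary-distance heuristic for 'why not' is an added assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(6, 4)), rng.normal(size=6)   # hypothetical policy net
W2, b2 = rng.normal(size=(3, 6)), rng.normal(size=3)   # three discrete actions

def explain(x):
    pattern = W1 @ x + b1 > 0
    D = np.diag(pattern.astype(float))
    A, c = W2 @ D @ W1, W2 @ D @ b1 + b2      # region-local affine rule
    action = int(np.argmax(A @ x + c))

    # 'Why': within this polytope the decision is linear, so the chosen
    # action's row of A gives feature-wise contributions.
    why = A[action] * x

    # 'Why not': distance to each bounding hyperplane w_i . x + b_i = 0;
    # crossing the nearest one enters an adjacent polytope where a
    # different affine rule, and possibly a different action, applies.
    dists = np.abs(W1 @ x + b1) / np.linalg.norm(W1, axis=1)
    nearest = int(np.argmin(dists))
    return action, why, nearest, dists[nearest]

action, why, boundary, dist = explain(rng.normal(size=4))
print(f"action={action}, nearest boundary: neuron {boundary} at distance {dist:.3f}")
```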
- Referee: [Abstract, stated weakest assumption] The claim that the hyperplane boundaries and linear functions inside each polytope 'correspond to causally meaningful factors' is asserted but not justified; the boundaries are determined solely by learned weights and pre-activation thresholds, supplying an exact functional partition rather than an explicit causal graph or do-operator semantics.
Authors: We agree that the partitions are induced by the network's learned parameters and therefore constitute a functional rather than an exogenous causal graph. In the revision we will explicitly qualify the scope of our causal claims: the hyperplanes delineate changes in the network's internal activation pattern, which, within the model's own computation, function as intervention points. Crossing a boundary corresponds to a do-intervention on the relevant pre-activation that alters downstream linear maps. We will add a short discussion distinguishing this model-internal notion of causality from full structural causal model discovery and will cite the relevant literature on causal abstraction in neural networks to ground the terminology. revision: partial
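As a concrete gloss on this model-internal reading (our illustration, not the paper's construction): a do-intervention clamps one ReLU gate to the opposite state while holding the input fixed, which is exactly what changes when that neuron's hyperplane is crossed into the adjacent polytope:

```python
import numpy as np

rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(1, 5)), rng.normal(size=1)

def output_under_gates(x, gates):
    """Forward pass with the ReLU on/off pattern forced to `gates`."""
    D = np.diag(gates.astype(float))
    return W2 @ D @ (W1 @ x + b1) + b2

x = rng.normal(size=3)
natural = W1 @ x + b1 > 0            # observed activation pattern
y = output_under_gates(x, natural)   # equals the ordinary forward pass

# do(gate_2 := flipped): force neuron 2's gate to its opposite state,
# isolating the downstream effect attributed to crossing its hyperplane.
intervened = natural.copy()
intervened[2] = ~intervened[2]
y_do = output_under_gates(x, intervened)
print("effect of the intervention on the output:", y_do - y)
```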
Circularity Check
No circularity: geometric representation cited as external input; causal extraction presented as new application without self-referential reduction
full rationale
The paper attributes the polytopal decomposition and piecewise-linear structure of ReLU networks to 'recent work on understanding the geometry of ReLU neural networks' without re-deriving or fitting those properties inside the present manuscript. The central move—extracting rules directly from the input-space regions and their associated linear maps—is described as a novel way to produce explanations that remain faithful to the original network, but this step does not define any quantity in terms of itself, rename a fitted parameter as a prediction, or rest on a load-bearing self-citation whose content is unverified. No equations appear in the provided text that would create a self-definitional loop, and the distinction between functional fidelity and causal semantics is an interpretive claim rather than a circular derivation. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: A ReLU network corresponds to a piecewise linear function divided into regions defined by an n-dimensional convex polytope.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear; the relation between the paper passage and the cited Recognition theorem is ambiguous)
  Paper passage: "a ReLU network corresponds to a piecewise linear function divided into regions defined by an n-dimensional convex polytope... each neuron divides the input space by a hyperplane"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, absolute_floor_iff_bare_distinguishability (tag: unclear; the relation between the paper passage and the cited Recognition theorem is ambiguous)
  Paper passage: "adjacent polytopes can be identified by flipping any of the bits in the bit vector... Hamming distance" (a minimal sketch of this bit-flip adjacency follows below)
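For the second passage, a minimal illustration of the bit-flip adjacency it describes (illustrative only; whether it matches the cited Lean theorem is precisely what the unclear tag flags):

```python
import numpy as np

def hamming1_neighbours(pattern):
    """Candidate adjacent polytopes: flip one bit of the activation vector.
    Not every flip yields a non-empty region, so candidates still need a
    feasibility check against the region's half-space constraints."""
    neighbours = []
    for i in range(len(pattern)):
        q = pattern.copy()
        q[i] = ~q[i]
        neighbours.append(q)
    return neighbours

for q in hamming1_neighbours(np.array([True, False, True])):
    print(q.astype(int))  # each is at Hamming distance 1 from [1, 0, 1]
```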
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Y. Liu, C. Cole, C. Peterson, and M. Kirby, 'ReLU Neural Networks, Polyhedral Decompositions, and Persistent Homology', 2023
- [2] G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer, 'Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks', May 19, 2017, arXiv:1702.01135. doi: 10.48550/arXiv.1702.01135
- [3] P. Pukowski, J. Spoerhase, and H. Lu, 'SkelEx and BoundEx - Geometrical Framework for Interpretable ReLU Neural Networks', in 2024 International Joint Conference on Neural Networks (IJCNN), Jun. 2024, pp. 1–8. doi: 10.1109/IJCNN60899.2024.10650882
- [5] J. A. Vincent and M. Schwager, 'Reachable Polyhedral Marching (RPM): An Exact Analysis Tool for Deep-Learned Control Systems', 2022. doi: 10.48550/arXiv.2210.08339
- [6] X. Yang, T. T. Johnson, H.-D. Tran, T. Yamaguchi, B. Hoxha, and D. Prokhorov, 'Reachability analysis of deep ReLU neural networks using facet-vertex incidence', in Proceedings of the 24th International Conference on Hybrid Systems: Computation and Control, Nashville, Tennessee: ACM, May 2021, pp. 1–7. doi: 10.1145/3447928.3456650
- [7] S. Xu, J. Vaughan, J. Chen, A. Zhang, and A. Sudjianto, 'Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification', arXiv, Nov. 2021
- [8] P. Madumal, T. Miller, L. Sonenberg, and F. Vetere, 'Explainable Reinforcement Learning Through a Causal Lens', Nov. 20, 2019, arXiv:1905.10958
- [9] E. Puiutta and E. M. Veith, 'Explainable Reinforcement Learning: A Survey', May 13, 2020, arXiv:2005.06247
- [10] T. Chakraborti, S. Sreedharan, Y. Zhang, and S. Kambhampati, 'Plan Explanations as Model Reconciliation: Moving Beyond Explanation as Soliloquy', in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, Melbourne, Australia: International Joint Conferences on Artificial Intelligence Organization, Aug. 2017, pp. 156–...
- [11]
- [12]
- [13] International Maritime Organisation, 'Development of a Goal-Based Instrument for Maritime Autonomous Surface Ships (MASS)', MSC 108/4, 13 February 2024
- [14] M. J. Villani et al., 'PICE: Polyhedral Complex Informed Counterfactual Explanations', Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, vol. 7, no. 1, Art. no. 1, Oct. 2024. doi: 10.1609/aies.v7i1.31742
- [15] B. Hanin and D. Rolnick, 'Deep ReLU Networks Have Surprisingly Few Activation Patterns', Oct. 20, 2019, arXiv:1906.00904. doi: 10.48550/arXiv.1906.00904
- [16] R. P. Stanley, 'An Introduction to Hyperplane Arrangements'
- [17] R. Balestriero and Y. LeCun, 'Fast and Exact Enumeration of Deep Networks Partitions Regions', Jan. 20, 2024, arXiv:2401.11188. doi: 10.48550/arXiv.2401.11188
- [18] G. Singh, T. Gehr, M. Püschel, and M. Vechev, 'An abstract domain for certifying neural networks', Proceedings of the ACM on Programming Languages, vol. 3, no. POPL, pp. 1–30, Jan. 2019. doi: 10.1145/3290354