pith. machine review for the scientific record.

arxiv: 2604.16426 · v1 · submitted 2026-04-04 · 💻 cs.LG

Recognition: 2 theorem links · Lean Theorem

Functional Similarity Metric for Neural Networks: Overcoming Parametric Ambiguity via Activation Region Analysis

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 17:44 UTC · model grok-4.3

classification 💻 cs.LG
keywords: functional similarity · activation regions · ReLU networks · neuron matching · model merging · MinHash · Jaccard index · permutation invariance

The pith

Neural networks can be compared by the topology of their activation regions rather than their raw weights, removing permutation and scaling ambiguities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks a stable way to measure functional similarity between neural networks, one that remains unchanged when the same computation is expressed through different parameter choices. ReLU networks admit algebraic symmetries (neuron permutations and positive scalings) that make direct weight comparisons unreliable even under tiny training perturbations. The method first L2-normalizes each neuron's weight vector and compensates the subsequent layer, then encodes each neuron's activation region as a binary vector over a data sample, and finally approximates Jaccard overlap via MinHash before solving the resulting assignment problem, producing a canonical functional metric. The distance mitigates neuron flickering and supports tasks such as model merging and pruning where parameter-based comparisons fail.
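
The machinery is compact enough to sketch end to end. The following is an editorial illustration rather than the paper's reference implementation: it assumes a single hidden ReLU layer, substitutes exact Jaccard similarity for MinHash, invents its own helper names, and uses scipy's linear_sum_assignment for the Hungarian step.

```python
# Editorial sketch of the described pipeline for one hidden ReLU layer.
# Assumptions not taken from the paper: exact Jaccard in place of MinHash,
# bias rescaled by the weight-vector norm, and a distance defined as
# 1 minus the mean similarity of matched neuron pairs.
import numpy as np
from scipy.optimize import linear_sum_assignment

def normalize_with_compensation(W1, b1, W2):
    """L2-normalize each hidden neuron's weight vector, rescale its bias,
    and push the scale into the outgoing weights (ReLU is positively homogeneous)."""
    scales = np.maximum(np.linalg.norm(W1, axis=1), 1e-12)   # one scale per hidden neuron
    return W1 / scales[:, None], b1 / scales, W2 * scales[None, :]

def binary_signatures(W1, b1, X):
    """Signature s[i, j] = 1 iff hidden neuron j is active on sample x_i."""
    return (X @ W1.T + b1 > 0).astype(np.uint8)              # shape (n_samples, n_neurons)

def jaccard_matrix(SA, SB):
    """Pairwise Jaccard similarity between the columns of two signature matrices."""
    SA, SB = SA.astype(np.int64), SB.astype(np.int64)
    inter = SA.T @ SB
    union = SA.sum(0)[:, None] + SB.sum(0)[None, :] - inter
    return np.where(union > 0, inter / np.maximum(union, 1), 1.0)

def functional_distance(netA, netB, X):
    """Match neurons across the two networks and return (distance, matching)."""
    W1a, b1a, _ = normalize_with_compensation(*netA)          # netX = (W1, b1, W2)
    W1b, b1b, _ = normalize_with_compensation(*netB)
    S = jaccard_matrix(binary_signatures(W1a, b1a, X),
                       binary_signatures(W1b, b1b, X))
    rows, cols = linear_sum_assignment(-S)                    # Hungarian step: maximize similarity
    return 1.0 - S[rows, cols].mean(), list(zip(rows, cols))
```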

Core claim

The central claim is that discrete binary signatures of neuron activation regions, obtained after L2-normalization to eliminate scaling, combined with MinHash approximation of the Jaccard index and Hungarian assignment for neuron matching, yield a functional similarity metric that is invariant under neuron permutation and positive diagonal scaling and remains stable under small weight perturbations.
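
For concreteness, the symmetry the claimed metric must quotient out can be written down for one hidden ReLU layer; the notation below is chosen for this note and is not taken from the paper.

```latex
% One-hidden-layer ReLU network:
f(x) = W^{(2)}\,\sigma\!\big(W^{(1)}x + b^{(1)}\big) + b^{(2)}, \qquad \sigma = \mathrm{ReLU}.
% For any permutation matrix P and any diagonal D = \mathrm{diag}(d_1,\dots,d_h) with d_j > 0,
\big(W^{(1)},\, b^{(1)},\, W^{(2)}\big) \;\longmapsto\; \big(D P W^{(1)},\; D P b^{(1)},\; W^{(2)} P^{\top} D^{-1}\big)
% leaves f unchanged, because \sigma(DPz) = DP\,\sigma(z) whenever D > 0.
% Invariance of the claimed metric d is the statement
d\big(\theta,\; (P,D)\cdot\theta\big) = 0 \qquad \text{for every such pair } (P, D).
```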

What carries the argument

Binary functional signature of activation regions: the pattern of which neurons fire for each input in a finite sample, used as a discrete topological descriptor of each neuron’s role.
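
As one concrete reading of that descriptor, the sketch below estimates the Jaccard overlap between two such binary signatures with a generic MinHash construction; the hash family, sketch length k, and helper names are assumptions of this illustration, not details taken from the paper.

```python
# Illustrative MinHash estimate of the Jaccard overlap between two binary
# signatures (0/1 vectors over the same data sample). Generic construction
# with k random affine hash functions; not the paper's exact scheme.
import numpy as np

_PRIME = 2_147_483_647                                   # large prime modulus

def minhash_sketch(active_idx, k=256, seed=0):
    """k-dimensional MinHash sketch of a set of active sample indices."""
    rng = np.random.default_rng(seed)                    # same seed => same hash family
    a = rng.integers(1, _PRIME, size=k)
    b = rng.integers(0, _PRIME, size=k)
    h = (a[:, None] * active_idx[None, :] + b[:, None]) % _PRIME
    return h.min(axis=1)                                 # minimum per hash function

def estimate_jaccard(sig_u, sig_v, k=256, seed=0):
    """Fraction of colliding minima approximates the Jaccard index."""
    u, v = np.flatnonzero(sig_u), np.flatnonzero(sig_v)
    if u.size == 0 and v.size == 0:
        return 1.0                                       # both neurons never fire
    if u.size == 0 or v.size == 0:
        return 0.0                                       # one fires, the other never does
    return float(np.mean(minhash_sketch(u, k, seed) == minhash_sketch(v, k, seed)))
```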

If this is right

  • Neuron matching becomes stable across independently trained models, allowing reliable identification of corresponding units for merging.
  • Pruning decisions can be guided by functional redundancy rather than weight magnitude, reducing the risk of removing critical neurons.
  • Transfer learning can align layers by functional role instead of index, improving initialization quality.
  • Model merging can proceed without post-merge fine-tuning to correct for flickering neurons.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same activation-region signatures could track how a network’s functional decomposition evolves during training.
  • If extended to other piecewise-linear activations, the metric might serve as a general tool for comparing networks across different activation families.
  • The approach suggests a route to canonical forms for entire networks that could be used to detect functional equivalence classes in large model zoos.

Load-bearing premise

Binary activation patterns over a chosen finite data sample are sufficient to distinguish functionally distinct neurons without introducing new ambiguities from sampling or discretization.

What would settle it

Two networks that realize identical input-output functions yet produce substantially different binary activation signatures on the same data sample would show the metric fails to capture functional equivalence.
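
A toy version of that test, using nothing beyond standard ReLU units and a hand-picked sample, shows the failure mode concretely: two neurons with different cutting hyperplanes can receive identical binary signatures when the sample never lands in the region where they disagree. The construction below is illustrative and not drawn from the paper.

```python
# Toy illustration (not from the paper): two distinct ReLU neurons whose
# binary signatures coincide on a small sample, so a sampled signature
# cannot tell them apart even though their activation regions differ.
import numpy as np

w1, b1 = np.array([1.0, 0.0]), 0.0     # active on the half-plane x > 0
w2, b2 = np.array([1.0, 1.0]), 0.0     # active on the half-plane x + y > 0

# A sample that happens to avoid the wedge where the two half-planes disagree.
X = np.array([[ 2.0,  0.5],
              [ 1.0,  1.0],
              [-1.5, -0.5],
              [-2.0,  0.5]])

s1 = (X @ w1 + b1 > 0).astype(int)
s2 = (X @ w2 + b2 > 0).astype(int)
print(s1, s2, np.array_equal(s1, s2))   # identical signatures: [1 1 0 0] twice
```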

Figures

Figures reproduced from arXiv: 2604.16426 by Kutomanov Hennadii.

Figure 1. Illustration of neuron j's activation region restricted to a bounded area K.
Figure 2. Visualization of hidden layer neuron decision boundaries for the two trained networks: (a) Network 1 and (b) Network 2. Visual comparison of these two figures clearly demonstrates that despite training on the same task and achieving similar accuracy (approximately 0.99 for both networks, as stated in the methodology), the internal weight configurations and, consequently, the geometry of individual neuron d…
original abstract

As modern deep learning architectures grow in complexity, representational ambiguity emerges as a critical barrier to their interpretability and reliable merging. For ReLU networks, identical functional mappings can be achieved through entirely different weight configurations due to algebraic symmetries: neuron permutation and positive diagonal scaling. Consequently, traditional parameter-based comparison methods exhibit extreme instability to slight weight perturbations during training. This paper proposes a mathematically grounded approach to constructing a stable canonical representation of neural networks and a robust functional similarity metric. We shift focus from comparing raw weights to analyzing the topology of neuron activation regions. The algorithm first eliminates scaling ambiguity via L2-normalization of weight vectors with subsequent layer compensation. Next, discrete approximations of activation regions are generated as binary functional signatures evaluated over a data sample. To overcome the computational bottleneck of comparing large binary vectors, we adapt Locality-Sensitive Hashing, specifically MinHash, providing a fast and statistically precise approximation of the Jaccard index. The final cross-network neuron matching is formulated as a linear sum assignment problem solved via the Hungarian algorithm. We demonstrate theoretically and experimentally that our metric mitigates the neuron "flickering" effect and exhibits exceptional robustness to minor weight perturbations. This framework provides a solid foundation for model merging, transfer learning, objective assessment during pruning, and Explainable AI paradigms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a functional similarity metric for ReLU networks to resolve parametric ambiguities (neuron permutation and positive scaling) that cause instability in weight-based comparisons. The approach first applies L2-normalization to weight vectors with layer compensation, then constructs discrete binary signatures of activation regions by evaluating ReLU patterns on a finite data sample, approximates Jaccard similarity via MinHash, and solves neuron matching as a linear assignment problem using the Hungarian algorithm. The central claim is that this yields a stable metric that mitigates neuron flickering and exhibits strong robustness to minor weight perturbations, supported by theoretical arguments and experiments, with applications to model merging, pruning, and XAI.

Significance. If the robustness claims hold under the stated construction, the metric would provide a useful tool for functional comparison of networks that is less sensitive to training-induced parameter variations than direct weight matching, potentially aiding reliable model merging and interpretability studies. The shift to activation-region topology is a coherent direction, though its practical value depends on whether finite-sample signatures reliably capture the relevant functional distinctions.

major comments (2)
  1. [§3] Activation region analysis and binary signatures: The theoretical claim that the metric mitigates flickering and is robust to perturbations rests on the unstated assumption that a finite data sample produces binary signatures that recover the full topological partition of activation regions. Because ReLU regions are unbounded polyhedral cones whose boundaries are hyperplanes, a finite sample can miss entire cones or boundaries in high-dimensional space, allowing functionally inequivalent networks to receive identical signatures. This assumption is load-bearing for the central robustness result and requires either sample-size bounds or explicit error analysis.
  2. [Abstract and §4] Theoretical demonstration: The abstract states that theoretical and experimental support is provided for robustness, yet no derivation details, error bounds on the MinHash approximation, or analysis of how sample size and data distribution affect the Jaccard estimate or final similarity score are supplied. Without these, it is not possible to verify that the metric remains parameter-free or that the reported stability is not an artifact of the chosen sampling procedure.
minor comments (1)
  1. [§2] Notation for the normalized weight vectors and the subsequent layer compensation step could be clarified with an explicit equation showing how scaling factors are propagated across layers.
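
One minimal form such an equation could take, for a single hidden ReLU layer and in notation chosen here rather than drawn from the paper (with the bias rescaled by the same factor as the weight vector), is:

```latex
% Normalization with layer compensation for hidden neuron j (notation ours, not the paper's).
\alpha_j = \big\lVert w_j^{(1)} \big\rVert_2, \qquad
\tilde{w}_j^{(1)} = w_j^{(1)} / \alpha_j, \qquad
\tilde{b}_j^{(1)} = b_j^{(1)} / \alpha_j, \qquad
\tilde{W}^{(2)}_{\,\cdot j} = \alpha_j\, W^{(2)}_{\,\cdot j}.
% The network function is unchanged because ReLU is positively homogeneous:
W^{(2)}_{\,\cdot j}\,\sigma\!\big(w_j^{(1)}\!\cdot x + b_j^{(1)}\big)
  \;=\; \alpha_j\, W^{(2)}_{\,\cdot j}\,\sigma\!\big(\tilde{w}_j^{(1)}\!\cdot x + \tilde{b}_j^{(1)}\big)
  \;=\; \tilde{W}^{(2)}_{\,\cdot j}\,\sigma\!\big(\tilde{w}_j^{(1)}\!\cdot x + \tilde{b}_j^{(1)}\big).
```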

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful and constructive comments. We provide point-by-point responses to the major comments below, indicating where revisions will be made to address the concerns.

point-by-point responses
  1. Referee: [§3] Activation region analysis and binary signatures: The theoretical claim that the metric mitigates flickering and is robust to perturbations rests on the unstated assumption that a finite data sample produces binary signatures that recover the full topological partition of activation regions. Because ReLU regions are unbounded polyhedral cones whose boundaries are hyperplanes, a finite sample can miss entire cones or boundaries in high-dimensional space, allowing functionally inequivalent networks to receive identical signatures. This assumption is load-bearing for the central robustness result and requires either sample-size bounds or explicit error analysis.

    Authors: We acknowledge the validity of this concern. The use of a finite data sample for constructing binary signatures is indeed an approximation that may not capture all unbounded activation regions in high dimensions. Our robustness claims are based on the observation that for small weight perturbations, the sampled signatures remain consistent in practice, as demonstrated in our experiments. To strengthen the theoretical foundation, we will revise the manuscript to include a discussion of the sampling limitations, potential error sources, and practical guidelines for choosing sample sizes relative to the input dimensionality. We will also reference results on the number of linear regions in ReLU networks to provide probabilistic guarantees where possible. This constitutes a partial revision as deriving tight, general sample-size bounds may require substantial additional analysis. revision: partial

  2. Referee: [Abstract and §4] Theoretical demonstration: The abstract states that theoretical and experimental support is provided for robustness, yet no derivation details, error bounds on the MinHash approximation, or analysis of how sample size and data distribution affect the Jaccard estimate or final similarity score are supplied. Without these, it is not possible to verify that the metric remains parameter-free or that the reported stability is not an artifact of the chosen sampling procedure.

    Authors: We agree that additional details would enhance the clarity and verifiability of our theoretical claims. The MinHash approximation for Jaccard similarity has standard error bounds from the literature, which we will now explicitly derive and include in the revised §4. Furthermore, we will add an analysis of the impact of sample size on the Jaccard estimate, including convergence rates and sensitivity to the data distribution. Regarding parameter-freeness, the core metric computation does not require tuning beyond the choice of sample, which we will clarify. These additions will be incorporated in the revised manuscript. revision: yes
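
For reference, the standard MinHash guarantees this response alludes to (textbook results in the spirit of Broder 1997 and Leskovec et al. 2014 in the reference list, reproduced here rather than derived from this paper): with k independent hash functions, the collision-fraction estimator of the Jaccard index J is unbiased, its variance shrinks as 1/k, and Hoeffding's inequality gives an exponential tail bound.

```latex
% A, B: active-index sets of the two binary signatures; h_i^{\min}(S): minimum of the i-th hash over S.
\hat{J} \;=\; \frac{1}{k}\sum_{i=1}^{k} \mathbf{1}\!\left[\, h_i^{\min}(A) = h_i^{\min}(B) \,\right],
\qquad
\mathbb{E}\big[\hat{J}\big] = J(A,B),
\qquad
\operatorname{Var}\big[\hat{J}\big] = \frac{J(1-J)}{k},
\qquad
\Pr\!\big[\, |\hat{J} - J| \ge \varepsilon \,\big] \;\le\; 2\exp\!\big(-2k\varepsilon^{2}\big).
```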

Circularity Check

0 steps flagged

No significant circularity in the metric derivation

full rationale

The paper constructs its functional similarity metric through a sequence of independent algorithmic steps: L2-normalization of weight vectors to eliminate scaling ambiguity, generation of discrete binary signatures from ReLU activation patterns evaluated on a data sample, MinHash approximation of Jaccard similarity, and Hungarian algorithm for neuron matching. These operations are defined directly from the network weights and input data without reducing any output quantity to a fitted parameter or self-referential definition. No self-citations are invoked as load-bearing premises for uniqueness or ansatz choices, and the theoretical demonstration of robustness is presented as following from the construction rather than presupposing the target result. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are detailed beyond standard assumptions about ReLU activation regions defining functional behavior.

axioms (1)
  • Domain assumption: activation regions of ReLU networks determine functional equivalence up to permutation and positive scaling symmetries. Invoked when shifting the comparison from weights to activation topology.

pith-pipeline@v0.9.0 · 5523 in / 1108 out tokens · 34340 ms · 2026-05-13T17:44:38.439701+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 1 internal anchor

  1. [1] Samuel K. Ainsworth, Jonathan Hayase, and Siddhartha S. Srinivasa. Git re-basin: Merging models modulo permutation symmetries. arXiv preprint arXiv:2209.04836, Sep 2022. Version 6, last revised 1 Mar 2023.
  2. [2] Igor A. Antonov and Viktor M. Saleev. An economic method of computing LPτ-sequences. USSR Computational Mathematics and Mathematical Physics, 19(1):252–256, 1979.
  3. [3] Masoud Ataei, Edrin Hasaj, Jacob Gipp, and Sepideh Forouzi. Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks. arXiv preprint arXiv:2504.14356, Apr 2025.
  4. [4] Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Learnability and the Vapnik–Chervonenkis dimension. Journal of the ACM, 36(4):929–965, 1989.
  5. [5] Andrei Z. Broder. On the resemblance and containment of documents. In Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No. 97TB100171), pages 21–29. IEEE, 1997.
  6. [6] Flavio Chierichetti and Ravi Kumar. LSH-preserving functions and their applications. Journal of the ACM, 62(5):33:1–33:28, 2015. Tight characterization of LSH-preserving transformations.
  7. [7] Luc Devroye, László Györfi, and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition, volume 31 of Stochastic Modelling and Applied Probability. Springer, New York, 1996.
  8. [8] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning.
  9. [9] Chapter 6: Deep Feedforward Networks.
  10. [10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. IEEE International Conference on Computer Vision (ICCV), pages 1026–1034, 2015.
  11. [11] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing (STOC), pages 604–613. ACM, 1998.
  12. [12] Andrej Karpathy. CS231n: Convolutional Neural Networks for Visual Recognition, 2015. Stanford University course notes.
  13. [13] Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, Cambridge, MA, 1994.
  14. [14] Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. Mining of Massive Datasets. Cambridge University Press, 2nd edition, 2014. Chapter 3 covers Locality-Sensitive Hashing.
  15. [15] Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, and John E. Hopcroft. Convergent learning: Do different neural networks learn the same representations? In Proceedings of the 4th International Conference on Learning Representations (ICLR), 2016.
  16. [16] Clare Lyle, Zeyu Zheng, Khimya Khetarpal, James Martens, Hado van Hasselt, Razvan Pascanu, and Will Dabney. Normalization and effective learning rates in reinforcement learning. arXiv preprint arXiv:2407.01800, Jul 2024.
  17. [17] Marissa A. Masden. Algorithmic determination of the combinatorial structure of the linear regions of ReLU neural networks. arXiv preprint arXiv:2207.07696, Jul 2022.
  18. [18] Michael D. McKay, Richard J. Beckman, and William J. Conover. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2):239–245, 1979.
  19. [19] Seyed Iman Mirzadeh, Mehrdad Farajtabar, Razvan Pascanu, and Hassan Ghasemzadeh. Understanding the role of training regimes in continual learning. arXiv preprint arXiv:2006.06958, Jun 2020.
  20. [20] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, 2nd edition, 2018.
  21. [21] Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 73:1–15, 2017.
  22. [22] Vinod Nair and Geoffrey E. Hinton. Rectified Linear Units Improve Restricted Boltzmann Machines. 2010.
  23. [23] Harald Niederreiter. Low-discrepancy and low-dispersion sequences. Journal of Number Theory, 30(1):51–70, 1988.
  24. [24] Art B. Owen. Quasi-Monte Carlo sampling. SIGGRAPH Course Notes.
  25. [25] Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing.
  26. [26] Suzanna Parkinson, Greg Ongie, and Rebecca Willett. ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models. arXiv preprint arXiv:2305.15598, May 2023. Last revised 17 Mar 2025.
  27. [27] Davide Sartor, Alberto Sinigaglia, and Gian Antonio Susto. Advancing constrained monotonic neural networks: Achieving universal approximation beyond bounded activations. arXiv preprint arXiv:2505.02537, May 2025. Last revised 6 May 2025.
  28. [28] Jürgen Schmidhuber. Neural predictors for detecting and removing redundant information. Pages 1–18, 2000.
  29. [29] Thiago Serra, Abhinav Kumar, Xin Yu, and Srikumar Ramalingam. Scaling up exact neural network compression by ReLU stability. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems 34 (NeurIPS 2021), pages 26097–26109, 2021.
  30. [30] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
  31. [31] Lee Sharkey, Bilal Chughtai, Joshua Batson, Jack Lindsey, Jeff Wu, Lucius Bushnaq, Nicholas Goldowsky-Dill, Stefan Heimersheim, Alejandro Ortega, Joseph Bloom, Stella Biderman, Adria Garriga-Alonso, Arthur Conmy, Neel Nanda, Jessica Rumbelow, Martin Wattenberg, Nandi Schoots, Joseph Miller, Eric J. Michaud, Stephen Casper, Max Tegmark, William Saunde… Open Problems in Mechanistic Interpretability.
  32. [32] Ilya M. Sobol'. On the distribution of points in a cube and the approximate evaluation of integrals. USSR Computational Mathematics and Mathematical Physics, 7(4):86–112, 1967.
  33. [33] Eduardo D. Sontag. VC dimension of neural networks. In Christopher M. Bishop, editor, Neural Networks and Machine Learning, pages 69–95. Springer, Berlin, 1998.
  34. [34] N. Joseph Tatro, Pin-Yu Chen, Payel Das, Igor Melnyk, Prasanna Sattigeri, and Rongjie Lai. Optimizing mode connectivity via neuron alignment. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020. arXiv:2009.02439 [cs.LG].
  35. [35] Viet-Hoang Tran, Thieu N. Vo, Tho H. Tran, An T. Nguyen, and Tan M. Nguyen. Monomial matrix group equivariant neural functional networks. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Sep 2024. arXiv version 3, last revised 13 Mar 2025.
  36. [36] V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2):264–280, 1971. English translation by B. Seckler of the Russian paper published in Dokl. Akad. Nauk SSSR 181(4):781–783, 1968.
  37. [37] Vladimir N. Vapnik. Statistical Learning Theory. Wiley-Interscience, New York, 1998.
  38. [38] Bing Xu, Naiyan Wang, Tianqi Chen, and Mu Li. Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853, 2015.
  39. [39] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31 (NIPS 2018), 2018. arXiv:1811.00866 [cs.LG].