pith. machine review for the scientific record. sign in

arxiv: 2604.09403 · v1 · submitted 2026-04-10 · 🧬 q-bio.MN

Recognition: unknown

Efficient Shapley values computation for Boolean network models of gene regulation

Giang Pham, Paolo Milazzo, Silvia Giulia Galfr\`e

Pith reviewed 2026-05-10 15:58 UTC · model grok-4.3

classification 🧬 q-bio.MN
keywords Boolean networksShapley valuesgene regulatory networksnode importancepropagation algorithmknock-out valuesknock-in valuessystems biology
0
0 comments X

The pith

A propagation method computes exact Shapley values for acyclic Boolean gene networks and good approximations for cyclic ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework to measure the importance of nodes in Boolean network models of gene regulation using two Shapley-value measures relative to a chosen target node. It defines knock-out Shapley values to capture the effect of disabling a node and knock-in Shapley values to capture the effect of forcing a node on. A propagation algorithm is introduced that follows the logical rules connecting nodes rather than testing every possible combination of states. This yields exact results when the network has no cycles and reliable approximations otherwise. Tests on standard models from the Cell Collective database confirm that the computed rankings match those from slower methods while running much faster.

Core claim

The authors present a Shapley-value framework for assessing node importance in Boolean networks with respect to a target node, consisting of complementary knock-out and knock-in measures. They introduce a propagation-based algorithm that exploits the logical structure of the network to avoid exhaustive enumeration of state combinations. The algorithm is exact for acyclic networks and yields good approximations for cyclic networks, accurately recovering node importance rankings on benchmark models from the Cell Collective database while delivering substantial speed-ups.

What carries the argument

The propagation-based algorithm that traces Shapley contributions along the Boolean logical dependencies from each node to the target without enumerating all 2^n state combinations.

If this is right

  • Node importance rankings become feasible to compute for networks too large for enumeration.
  • Intervention targets in gene regulatory networks can be prioritized with far less computation.
  • Exact results are available whenever the regulatory network contains no feedback loops.
  • Approximations remain sufficient to recover correct orderings on real biological models.
  • The approach scales to models where exhaustive simulation is impossible.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The propagation idea could extend to compute other cooperative solution concepts on logical networks.
  • Coupling the method with network inference algorithms might enable rapid identification of candidate regulators from experimental data.
  • The speed-ups open the possibility of analyzing dynamic influence during network growth or perturbation studies.
  • Similar propagation techniques may apply to multi-valued or probabilistic logic models of regulation.

Load-bearing premise

The logical dependencies among nodes in the Boolean network allow Shapley contributions to be propagated accurately without checking every possible combination of node states.

What would settle it

A direct comparison on a small cyclic network from the Cell Collective benchmarks showing that the propagation method's node importance ranking differs substantially from the ranking obtained by full enumeration.

Figures

Figures reproduced from arXiv: 2604.09403 by Giang Pham, Paolo Milazzo, Silvia Giulia Galfr\`e.

Figure 1
Figure 1. Figure 1: A Boolean network. For the input configuration S = {C}, the obtained attractor contains 5 states with the summarized state is a {C} = {⟨A, 0⟩,⟨B, 0⟩,⟨C, 1⟩,⟨D, 0⟩,⟨E, 1⟩,⟨F, 0⟩ ⟨G, 1⟩,⟨H, 0⟩,⟨I, 1⟩} (3) B. Shapley Value The Shapley value is a concept from cooperative game theory that assigns a fair contribution score to each player based on its marginal effect across all coalitions it might participate in.… view at source ↗
Figure 2
Figure 2. Figure 2: A Binary Boolean Network with the corre [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: A BBN network with cycle and correspond [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: NDCG of KO and KI values and relative RMSE under the Shapley weighting scheme. Models [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Ratio between the runtime of the standard analysis and the propagation method. Models are grouped [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
read the original abstract

Identifying dynamically influential nodes in biological networks is a central problem in systems biology, particularly for prioritizing intervention targets in gene regulatory networks. In this paper, we propose a Shapley-value-based framework for assessing the importance of nodes in a Boolean network with respect to a given target node. The framework comprises two complementary measures: the Knock-out and the Knock-in Shapley values. Moreover, we present a propagation-based method that enables their efficient computation. By exploiting the logical structure of the network, the method avoids exhaustive simulations. The approach is exact for acyclic networks and provides good approximations for cyclic networks. Evaluation on benchmark models from the Cell Collective database shows that the propagation method accurately recovers node importance rankings while achieving substantial speed-ups.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a propagation-based method to compute Knock-out and Knock-in Shapley values for assessing node importance in Boolean gene regulatory networks with respect to a target node. It claims the method is exact for acyclic networks by exploiting logical structure to avoid exhaustive state enumeration, provides good approximations for cyclic networks, and empirically recovers accurate node importance rankings on Cell Collective benchmark models while delivering substantial computational speed-ups.

Significance. If the central claims hold, the work would offer a practical tool for prioritizing intervention targets in systems biology by adapting cooperative game theory to network propagation, reducing the computational burden of Shapley value calculations that typically require 2^n enumerations. The exactness guarantee on acyclic graphs and the use of standard Shapley definitions (without invented parameters) are strengths; however, the reliance on empirical ranking recovery for cyclic cases limits the strength of the significance assessment.

major comments (2)
  1. [Methods (cyclic approximation)] Propagation method for cyclic networks (as described following the acyclic exact case): the claim of 'good approximations' that 'accurately recovers node importance rankings' rests entirely on benchmark performance without any derivation of error bounds, analysis of how feedback loops distort propagated contributions, or worst-case guarantees. This is load-bearing for the central claim, as cycle-induced dependencies could violate the implicit independence assumptions in the propagation rules and degrade ranking fidelity even if average Cell Collective results appear acceptable.
  2. [Results/Evaluation] Evaluation on Cell Collective benchmarks: while ranking recovery and speed-ups are asserted, no quantitative details (e.g., specific speed-up factors, Kendall-tau or Spearman correlation values, or comparison against exhaustive enumeration baselines) are supplied in the reported results, undermining the ability to verify the 'substantial speed-ups' and 'accurate' recovery claims.
minor comments (2)
  1. [Introduction/Methods] Notation for Knock-out vs. Knock-in Shapley values could be clarified with explicit equations early in the methods to distinguish the two complementary measures.
  2. [Abstract] The abstract states the approach 'avoids exhaustive simulations' but does not specify the network properties (e.g., update rules or state-space reduction) that enable this; a short clarifying sentence would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with clarifications and indicate planned revisions where appropriate to strengthen the presentation and scope of our claims.

read point-by-point responses
  1. Referee: [Methods (cyclic approximation)] Propagation method for cyclic networks (as described following the acyclic exact case): the claim of 'good approximations' that 'accurately recovers node importance rankings' rests entirely on benchmark performance without any derivation of error bounds, analysis of how feedback loops distort propagated contributions, or worst-case guarantees. This is load-bearing for the central claim, as cycle-induced dependencies could violate the implicit independence assumptions in the propagation rules and degrade ranking fidelity even if average Cell Collective results appear acceptable.

    Authors: We agree that the approximation for cyclic networks lacks theoretical error bounds or worst-case guarantees and is validated empirically. The propagation rules implicitly assume limited interference from feedback, which cycles can violate. Our focus is on practical utility for biological networks rather than universal guarantees. We will revise the manuscript to add an explicit limitations subsection discussing how feedback loops may affect propagated contributions, along with additional empirical quantification of approximation errors (e.g., deviation from exact Shapley values on smaller cyclic subnetworks) to better support the 'good approximations' claim. revision: partial

  2. Referee: [Results/Evaluation] Evaluation on Cell Collective benchmarks: while ranking recovery and speed-ups are asserted, no quantitative details (e.g., specific speed-up factors, Kendall-tau or Spearman correlation values, or comparison against exhaustive enumeration baselines) are supplied in the reported results, undermining the ability to verify the 'substantial speed-ups' and 'accurate' recovery claims.

    Authors: We accept this point. While the results are shown in figures, the main text does not report explicit numerical metrics. In the revised version we will insert specific quantitative details, including average Kendall-tau and Spearman correlations for ranking recovery across the benchmark set, mean and range of speed-up factors versus exhaustive enumeration, and direct baseline comparisons, to make the performance claims verifiable. revision: yes

Circularity Check

0 steps flagged

No circularity: standard Shapley definition applied via network propagation, validated externally

full rationale

The derivation begins from the standard cooperative-game definition of Shapley values and applies it to Boolean networks via a propagation scheme that follows the logical update rules. Exactness on acyclic graphs follows directly from the absence of feedback (no enumeration needed), while cyclic approximations are presented as empirical and benchmarked against the independent Cell Collective database models. No equation reduces a claimed prediction to a fitted input by construction, no self-citation supplies a load-bearing uniqueness theorem or ansatz, and the method does not rename a known result as a new derivation. The central claims therefore remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that Boolean network logic supports value propagation and on the standard mathematical axioms of Shapley values; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • standard math Shapley value axioms (efficiency, symmetry, dummy player, additivity) from cooperative game theory
    The framework directly adopts the standard Shapley value definition to define node importance.

pith-pipeline@v0.9.0 · 5418 in / 1312 out tokens · 19277 ms · 2026-05-10T15:58:03.331790+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

24 extracted references

  1. [1]

    Metabolic stability and epigenesis in randomly constructed genetic nets,

    S. A. Kauffman, “Metabolic stability and epigenesis in randomly constructed genetic nets,”Journal of theoretical biology, vol. 22, no. 3, pp. 437–467, 1969

  2. [2]

    The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in drosophila melanogaster,

    “The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in drosophila melanogaster,”Journal of Theoretical Biology, vol. 223, no. 1, pp. 1–18, 2003

  3. [3]

    The yeast cell-cycle network is robustly designed,

    F. Li, T. Long, Y . Lu, Q. Ouyang, and C. Tang, “The yeast cell-cycle network is robustly designed,”Proceedings of the National Academy of Sciences, vol. 101, no. 14, pp. 4781– 4786, 2004

  4. [4]

    A logical model provides insights into t cell receptor signaling,

    J. Saez-Rodriguez, L. Simeoni, J. A. Lindquist, R. Hemen- way, U. Bommhardt, B. Arndt, U.-U. Haus, R. Weismantel, E. D. Gilles, S. Klamt, and B. Schraven, “A logical model provides insights into t cell receptor signaling,”PLOS Computational Biology, vol. 3, 08 2007

  5. [5]

    Gatekeepr: an r shiny application for the identification of nodes with high dynamic impact in boolean networks,

    F. M. Weidner, N. Ikonomi, S. D. Werle, J. D. Schwab, and H. A. Kestler, “Gatekeepr: an r shiny application for the identification of nodes with high dynamic impact in boolean networks,”Bioinformatics, vol. 40, p. btae007, 01 2024

  6. [6]

    Activities and sensitiv- ities in boolean network models,

    I. Shmulevich and S. A. Kauffman, “Activities and sensitiv- ities in boolean network models,”Physical review letters, vol. 93, no. 4, p. 048701, 2004

  7. [7]

    Impact of individual nodes in boolean network dynamics,

    F. Ghanbarnejad and K. Klemm, “Impact of individual nodes in boolean network dynamics,”Europhysics Letters, vol. 99, no. 5, p. 58006, 2012

  8. [8]

    Vital nodes identification in complex networks,

    L. L ¨u, D. Chen, X.-L. Ren, Q.-M. Zhang, Y .-C. Zhang, and T. Zhou, “Vital nodes identification in complex networks,” Physics reports, vol. 650, pp. 1–63, 2016

  9. [9]

    Harmonic analysis of boolean networks: determinative power and perturba- tions,

    R. Heckel, S. Schober, and M. Bossert, “Harmonic analysis of boolean networks: determinative power and perturba- tions,”EURASIP Journal on Bioinformatics and Systems Biology, vol. 2013, p. 6, May 2013

  10. [10]

    Logical reduction of bio- logical networks to their most determinative components,

    M. T. Matache and V . Matache, “Logical reduction of bio- logical networks to their most determinative components,” Bulletin of Mathematical Biology, vol. 78, pp. 1520–1545, Jul 2016

  11. [11]

    Identification of biologically essential nodes via determinative power in logical models of cellular processes,

    B. Pentzien, T. Heckel,et al., “Identification of biologically essential nodes via determinative power in logical models of cellular processes,”Frontiers in Physiology, vol. 9, p. 1185, 2018

  12. [12]

    Capturing dynamic rele- vance in boolean networks using graph theoretical mea- sures,

    F. M. Weidner, J. D. Schwab, S. D. Werle, N. Ikonomi, L. Lausser, and H. A. Kestler, “Capturing dynamic rele- vance in boolean networks using graph theoretical mea- sures,”Bioinformatics, vol. 37, no. 20, pp. 3530–3537, 2021

  13. [13]

    Preliminary results on shapley value notions and propagation methods for boolean net- works,

    G. Pham and P. Milazzo, “Preliminary results on shapley value notions and propagation methods for boolean net- works,” inInternational Symposium: From Data to Models and Back, pp. 90–112, Springer, 2023

  14. [14]

    Gene importance assessment based on shapley values for boolean networks: Valida- tion and scalability analysis,

    G. Pham and P. Milazzo, “Gene importance assessment based on shapley values for boolean networks: Valida- tion and scalability analysis,” inInternational Symposium: From Data to Models and Back, pp. 19–33, Springer, 2024

  15. [15]

    A survey of gene regulatory networks modelling methods: from dif- ferential equations, to boolean and qualitative bioinspired models,

    R. Barbuti, R. Gori, P. Milazzo, and L. Nasti, “A survey of gene regulatory networks modelling methods: from dif- ferential equations, to boolean and qualitative bioinspired models,”Journal of Membrane Computing, vol. 2, no. 3, pp. 207–226, 2020

  16. [16]

    Concepts in boolean network modeling: What do they all mean?,

    J. D. Schwab, S. D. K ¨uhlwein, N. Ikonomi, M. K ¨uhl, and H. A. Kestler, “Concepts in boolean network modeling: What do they all mean?,”Computational and structural biotechnology journal, vol. 18, pp. 571–582, 2020

  17. [17]

    A comprehensive review of the use of shapley value to assess node importance in the analysis of biological networks,

    G. Pham and P. Milazzo, “A comprehensive review of the use of shapley value to assess node importance in the analysis of biological networks,”Computers in Biology and Medicine, vol. 184, p. 100185, 2025

  18. [18]

    Attractor detection and enumera- tion algorithms for boolean networks,

    T. Mori and T. Akutsu, “Attractor detection and enumera- tion algorithms for boolean networks,”Computational and Structural Biotechnology Journal, vol. 20, pp. 2512–2520, 2022

  19. [19]

    Network biology: under- standing the cell’s functional organization,

    A.-L. Barabasi and Z. N. Oltvai, “Network biology: under- standing the cell’s functional organization,”Nature Reviews Genetics, vol. 5, no. 2, pp. 101–113, 2004

  20. [20]

    Structure and evolution of transcrip- tional regulatory networks,

    M. M. Babu, N. M. Luscombe, L. Aravind, M. Gerstein, and S. A. Teichmann, “Structure and evolution of transcrip- tional regulatory networks,”Current Opinion in Structural Biology, vol. 14, no. 3, pp. 283–291, 2004

  21. [21]

    Network motifs: theory and experimental ap- proaches,

    U. Alon, “Network motifs: theory and experimental ap- proaches,”Nature Reviews Genetics, vol. 8, no. 6, pp. 450– 461, 2007

  22. [22]

    A fast and effective heuristic for the feedback arc set problem,

    P. Eades, X. Lin, and W. F. Smyth, “A fast and effective heuristic for the feedback arc set problem,”Information processing letters, vol. 47, no. 6, pp. 319–323, 1993

  23. [23]

    Emergent decision-making in biological signal transduc- tion networks,

    T. Helikar, J. Konvalina, J. Heidel, and J. A. Rogers, “Emergent decision-making in biological signal transduc- tion networks,”Proceedings of the National Academy of Sciences, vol. 105, no. 6, pp. 1913–1918, 2008

  24. [24]

    Systems perturbation analysis of a large-scale signal transduction model reveals potentially influential candidates for cancer therapeutics,

    B. L. Puniya, L. Allen, C. Hochfelder, M. Majumder, and T. Helikar, “Systems perturbation analysis of a large-scale signal transduction model reveals potentially influential candidates for cancer therapeutics,”Frontiers in Bioengi- neering and Biotechnology, vol. 4, p. 10, 2016. 15 APPENDIX SUPPLEMENTARYMATERIAL S1: ADDITIONALEXPERIMENTS Fig. S1: Ratio be...