arXiv: 2402.03554 · v3 · submitted 2024-02-05 · cs.IT · math.IT · math.PR

Explicit Formula for Partial Information Decomposition

Pith reviewed 2026-05-24 04:09 UTC · model grok-4.3

classification cs.IT · math.IT · math.PR
keywords partial information decomposition · information atoms · do-operation · Williams-Beer axioms · mutual information · redundant information · synergistic information · unique information

The pith

A do-operation that sets marginals to chosen values supplies the first explicit formula for the atoms of partial information decomposition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the open problem of decomposing the mutual information between one random variable and a pair of others into unique, redundant, and synergistic atoms. Williams and Beer introduced axioms in 2010 that any such decomposition must obey, yet no closed-form expression had been shown to meet them for arbitrary variables. The authors define a do-operation that fixes a chosen marginal distribution while leaving the rest of the joint unchanged. This operation yields an explicit formula for each atom that satisfies the original axioms plus later consistency requirements. If the construction holds, analysts gain a direct computational route to these fine-grained information quantities without iterative optimization or case-by-case definitions.
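
For orientation, the bookkeeping that any such decomposition is usually required to satisfy in the two-source case can be written out as follows. This is the standard bivariate consistency system from the partial information decomposition literature, not the paper's explicit formula; the symbols Red, Un, and Syn are notation assumed here for the redundant, unique, and synergistic atoms.

```latex
\begin{align}
I(X,Y;Z) &= \mathrm{Red}(X,Y \to Z) + \mathrm{Un}(X \to Z \mid Y)
            + \mathrm{Un}(Y \to Z \mid X) + \mathrm{Syn}(X,Y \to Z), \\
I(X;Z)   &= \mathrm{Red}(X,Y \to Z) + \mathrm{Un}(X \to Z \mid Y), \\
I(Y;Z)   &= \mathrm{Red}(X,Y \to Z) + \mathrm{Un}(Y \to Z \mid X).
\end{align}
```

These are three equations in four unknowns, so Shannon quantities alone leave one degree of freedom. Fixing any single atom determines the other three, which is why a formula for one atom amounts to a formula for all of them.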

Core claim

The central claim is that the do-operation, which sets a prescribed marginal to a target distribution, produces explicit expressions for the unique, redundant, and synergistic information atoms that simultaneously obey Williams and Beer’s axioms together with the additional properties identified in subsequent work.

What carries the argument

The do-operation, which sets a chosen marginal distribution to a desired value while preserving the remaining joint structure.

If this is right

  • The formula directly computes unique, redundant, and synergistic information for any finite alphabet size.
  • The resulting atoms satisfy the original Williams-Beer axioms by construction.
  • The atoms also obey the additional consistency properties required by later studies.
  • The method applies uniformly to arbitrary random variables rather than special cases.
  • No auxiliary optimization or iterative procedure is needed once the joint distribution is given.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same construction may extend to continuous variables if the do-operation can be defined via densities or measures.
  • Because the operation mirrors Pearl’s do-calculus, the atoms could acquire causal interpretations in systems already analyzed with interventions.
  • Implementations on small discrete systems would allow direct numerical checks against exhaustive enumeration of all possible atom assignments.
  • If the formula scales, it could serve as a building block for higher-order decompositions involving more than three variables.
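
A numerical check on a small discrete system could start from a sketch like the one below. It does not implement the paper's formula. It only computes the three Shannon mutual informations from a joint pmf and fills in the remaining atoms once a redundancy value is supplied. The function names and the dictionary representation of the pmf are assumptions made for illustration.

```python
from collections import defaultdict
from math import log2


def marginal(joint, keep):
    """Marginalize a pmf {(x, y, z): p} onto the coordinate indices in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def mutual_information(joint, a, b):
    """I(A;B) in bits, where `a` and `b` are tuples of coordinate indices."""
    p_ab = marginal(joint, a + b)
    p_a = marginal(joint, a)
    p_b = marginal(joint, b)
    total = 0.0
    for key, p in p_ab.items():
        if p > 0:
            ka, kb = key[:len(a)], key[len(a):]
            total += p * log2(p / (p_a[ka] * p_b[kb]))
    return total


def atoms_from_redundancy(joint, red):
    """Solve the bivariate consistency equations for a given redundancy value.

    Coordinates are ordered (x, y, z) with z as the target. The value `red`
    must come from elsewhere (for example, from the formula under test).
    """
    i_xz = mutual_information(joint, (0,), (2,))
    i_yz = mutual_information(joint, (1,), (2,))
    i_xyz = mutual_information(joint, (0, 1), (2,))
    un_x = i_xz - red
    un_y = i_yz - red
    syn = i_xyz - red - un_x - un_y
    return {"red": red, "un_x": un_x, "un_y": un_y, "syn": syn}


# XOR gate with independent uniform inputs: z = x XOR y.
xor = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
xor_atoms = atoms_from_redundancy(xor, red=0.0)
```

For the XOR gate, I(X;Z) and I(Y;Z) are both zero, so non-negativity forces the redundant and unique atoms to zero and the full bit of I(X,Y;Z) is synergistic. Any candidate formula can be compared against that value.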

Load-bearing premise

The do-operation must be unambiguously defined for every joint distribution of random variables and must generate atoms that meet every listed axiom without requiring further unstated restrictions.

What would settle it

A concrete joint distribution on three or four binary variables where the formula produces atom values that violate at least one Williams-Beer axiom or disagree with the unique known values in a fully enumerated small system.
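
A search for such a counterexample could use a checker along these lines. It is a minimal sketch that tests only necessary conditions: non-negativity of the four atoms and the standard bivariate consistency equations. It does not test the Williams-Beer redundancy axioms themselves (symmetry, self-redundancy, monotonicity), and the function name and dictionary keys are hypothetical.

```python
def check_atoms(i_xz, i_yz, i_xyz, atoms, tol=1e-9):
    """Return the list of violated necessary conditions (empty if none).

    i_xz, i_yz, i_xyz are I(X;Z), I(Y;Z), I(X,Y;Z) in bits; `atoms` maps
    "red", "un_x", "un_y", "syn" to candidate atom values.
    """
    red, un_x, un_y, syn = (atoms[k] for k in ("red", "un_x", "un_y", "syn"))
    violations = []
    for name in ("red", "un_x", "un_y", "syn"):
        if atoms[name] < -tol:
            violations.append(f"{name} is negative")
    if abs(red + un_x - i_xz) > tol:
        violations.append("red + un_x != I(X;Z)")
    if abs(red + un_y - i_yz) > tol:
        violations.append("red + un_y != I(Y;Z)")
    if abs(red + un_x + un_y + syn - i_xyz) > tol:
        violations.append("atoms do not sum to I(X,Y;Z)")
    return violations


# XOR gate with uniform inputs: I(X;Z) = I(Y;Z) = 0 and I(X,Y;Z) = 1 bit.
good = {"red": 0.0, "un_x": 0.0, "un_y": 0.0, "syn": 1.0}
bad = {"red": 0.5, "un_x": -0.5, "un_y": -0.5, "syn": 1.5}
ok_report = check_atoms(0.0, 0.0, 1.0, good)
bad_report = check_atoms(0.0, 0.0, 1.0, bad)
```

The second candidate satisfies all three sum constraints yet fails non-negativity, which illustrates why the consistency equations alone cannot certify a decomposition.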

Figures

Figures reproduced from arXiv: 2402.03554 by Andrew Clark, Aobo Lyu, Netanel Raviv.

Figure 1: A pictorial representation of Partial Information D[…]

Original abstract

Mutual information between two random variables is a well-studied notion, whose understanding is fairly complete. Mutual information between one random variable and a pair of other random variables, however, is a far more involved notion. Specifically, Shannon's mutual information does not capture fine-grained interactions between those three variables, resulting in limited insights in complex systems. To capture these fine-grained interactions, in 2010 Williams and Beer proposed to decompose this mutual information to information atoms, called unique, redundant, and synergistic, and proposed several operational axioms that these atoms must satisfy. In spite of numerous efforts, a general formula which satisfies these axioms has yet to be found. Inspired by Judea Pearl's do-calculus, we resolve this open problem by introducing the do-operation, an operation over the variable system which sets a certain marginal to a desired value, which is distinct from any existing approaches. Using this operation, we provide the first explicit formula for calculating the information atoms so that Williams and Beer's axioms are satisfied, as well as additional properties from subsequent studies in the field.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript claims to resolve the long-standing open problem of finding an explicit formula for the partial information decomposition (PID) atoms (unique, redundant, and synergistic) that satisfy the Williams-Beer axioms from 2010, along with additional properties identified in later work. It introduces a novel 'do-operation' (inspired by Pearl's do-calculus) that sets a chosen marginal to a desired value and uses this operation to construct the atoms via an explicit formula.

Significance. If the do-operation is shown to be canonically defined and the resulting formula satisfies non-negativity together with the full set of Williams-Beer lattice axioms for arbitrary joint distributions, the result would be a substantial contribution to information theory, supplying the first closed-form solution to a problem that has remained unresolved despite extensive prior effort.

major comments (1)
  1. [Definition and properties of the do-operation; theorem on axiom compliance] The do-operation is the load-bearing primitive for the explicit formula. For general (especially continuous or non-discrete) random variables, the operation of setting one marginal to a prescribed value while preserving the remainder of the joint is not uniquely determined by the observed distribution; multiple interventional measures can realize the same marginal constraint. The manuscript must therefore supply an explicit completion rule, prove that the rule is canonical, and verify that the resulting atoms remain non-negative and obey the complete Williams-Beer axioms on every joint (see the construction that follows the definition of the do-operation and the subsequent theorem claiming axiom satisfaction).

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive comments. The major comment concerns the definition, uniqueness, and scope of the do-operation. We address this point below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: The do-operation is the load-bearing primitive for the explicit formula. For general (especially continuous or non-discrete) random variables, the operation of setting one marginal to a prescribed value while preserving the remainder of the joint is not uniquely determined by the observed distribution; multiple interventional measures can realize the same marginal constraint. The manuscript must therefore supply an explicit completion rule, prove that the rule is canonical, and verify that the resulting atoms remain non-negative and obey the complete Williams-Beer axioms on every joint (see the construction that follows the definition of the do-operation and the subsequent theorem claiming axiom satisfaction).

    Authors: We appreciate this observation. The manuscript develops the do-operation and the explicit formula primarily in the setting of discrete random variables with finite support, which is the standard setting for the Williams-Beer PID axioms and most subsequent work in the field. In this discrete case, the do-operation is defined by adjusting the probability mass function to enforce the desired marginal while keeping the conditional distributions of the remaining variables unchanged; this yields a unique interventional distribution. We will revise the manuscript to state this completion rule explicitly at the definition of the do-operation, include a short proof of its uniqueness (canonicity) under the discrete assumption, and confirm that the subsequent theorem establishing non-negativity and axiom compliance applies to all finite discrete joints. We agree that the continuous case requires additional specification and will add a remark noting this as a direction for future research.

    revision: yes
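
The completion rule described in the rebuttal, for the finite discrete case, can be illustrated with a short sketch. It is one literal reading of "enforce the desired marginal while keeping the conditional distributions of the remaining variables unchanged" and should not be taken as the paper's own definition of the do-operation. The name `set_marginal` and the dictionary representation are hypothetical.

```python
from collections import defaultdict


def set_marginal(joint, index, target, tol=1e-12):
    """Re-weight a pmf {outcome_tuple: p} so coordinate `index` has pmf `target`.

    The conditional distribution of the other coordinates given that
    coordinate is left unchanged: p'(o) = target(o[index]) * p(o) / p(o[index]).
    """
    if abs(sum(target.values()) - 1.0) > 1e-9:
        raise ValueError("target marginal must sum to 1")
    current = defaultdict(float)
    for outcome, p in joint.items():
        current[outcome[index]] += p
    # The conditional is undefined where the original marginal has no mass.
    for value, q in target.items():
        if q > tol and current.get(value, 0.0) <= tol:
            raise ValueError(f"no conditional available for value {value!r}")
    return {
        outcome: target.get(outcome[index], 0.0) * p / current[outcome[index]]
        for outcome, p in joint.items()
        if p > tol
    }


# Copy gate z = x with an independent fair coin y; outcomes are (x, y, z).
copy_gate = {(x, y, x): 0.25 for x in (0, 1) for y in (0, 1)}
shifted = set_marginal(copy_gate, index=0, target={0: 0.75, 1: 0.25})
```

The explicit error for zero-mass values marks the one place where even the discrete rule needs a convention. For discrete variables with full support the result is unique, which matches the uniqueness claim made in the rebuttal.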

Circularity Check

0 steps flagged

No significant circularity; derivation rests on newly introduced do-operation

The paper introduces a novel do-operation (distinct from existing approaches) to construct an explicit formula for the PID atoms that satisfies the Williams-Beer axioms. This provides independent grounding rather than reducing the result to prior definitions, fitted parameters, or self-citations by construction. No load-bearing self-citation chains, self-definitional reductions, or renamings of known results are exhibited in the derivation chain. The central claim depends only on the properties of the new operation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Central claim rests on the introduction of the do-operation as the mechanism that yields explicit atom expressions meeting the axioms.

axioms (1)
  • domain assumption: Williams and Beer's operational axioms for unique, redundant, and synergistic information atoms must be satisfied
    The paper states the formula satisfies these axioms; they are taken as the target properties to meet.
invented entities (1)
  • do-operation (no independent evidence)
    purpose: Operation over the variable system that sets a certain marginal to a desired value, used to define the information atoms
    Presented as the key new construct, distinct from existing approaches, that enables the explicit formula.

pith-pipeline@v0.9.0 · 5710 in / 1223 out tokens · 24164 ms · 2026-05-24T04:09:04.698899+00:00 · methodology
