Explicit Formula for Partial Information Decomposition
Pith reviewed 2026-05-24 04:09 UTC · model grok-4.3
The pith
A do-operation that sets marginals to chosen values supplies the first explicit formula for the atoms of partial information decomposition.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the do-operation, which sets a prescribed marginal to a target distribution, yields explicit expressions for the unique, redundant, and synergistic information atoms, and that these atoms obey Williams and Beer’s axioms as well as the additional properties identified in subsequent work.
What carries the argument
The do-operation, which sets a chosen marginal distribution to a desired value while preserving the remaining joint structure.
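A minimal sketch of the operation for finite alphabets may make this concrete. It assumes the reading Pr(A=x, C=z) = Pr(X=x|Z=z) Pr(C=z) quoted further down this page; the function name `do_operation`, the dictionary representation of distributions, and the choice to raise an error when the target puts mass where Pr(X|Z=z) is undefined are illustrative assumptions, not the paper's code.

```python
from collections import defaultdict


def do_operation(joint_xz, target_c):
    """Hypothetical helper: impose the marginal target_c on the Z-coordinate.

    joint_xz: dict mapping (x, z) -> Pr(X=x, Z=z)
    target_c: dict mapping z -> Pr(C=z), the marginal to be imposed
    Returns a dict mapping (x, z) -> Pr(A=x, C=z) = Pr(X=x | Z=z) * Pr(C=z).
    """
    # Marginal Pr(Z=z) of the input joint, needed for the conditional Pr(X | Z).
    pz = defaultdict(float)
    for (_, z), p in joint_xz.items():
        pz[z] += p

    # Assumption of this sketch: the target may only put mass where Pr(X | Z=z)
    # is defined, i.e. where the input gives Z=z positive probability.
    for z, c in target_c.items():
        if c > 0 and pz.get(z, 0.0) == 0:
            raise ValueError(f"Pr(X | Z={z!r}) is undefined for this input")

    out = {}
    for (x, z), p in joint_xz.items():
        c = target_c.get(z, 0.0)
        if p > 0 and c > 0:
            out[(x, z)] = p / pz[z] * c  # Pr(X=x | Z=z) * Pr(C=z)
    return out


# Illustrative input: a noisy binary channel and a skewed target marginal.
joint_xz = {(0, 0): 0.4, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.4}
target_c = {0: 0.2, 1: 0.8}
out = do_operation(joint_xz, target_c)
print(out)
```

In this example the output sums to one, its second coordinate has the target marginal, and the conditional of the first coordinate given the second is unchanged from the input.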
If this is right
- The formula directly computes unique, redundant, and synergistic information for any finite alphabet size.
- The resulting atoms satisfy the original Williams-Beer axioms by construction.
- The atoms also obey the additional consistency properties required by later studies.
- The method applies uniformly to arbitrary random variables rather than special cases.
- No auxiliary optimization or iterative procedure is needed once the joint distribution is given.
Where Pith is reading between the lines
- The same construction may extend to continuous variables if the do-operation can be defined via densities or measures.
- Because the operation mirrors Pearl’s do-calculus, the atoms could acquire causal interpretations in systems already analyzed with interventions.
- Implementations on small discrete systems would allow direct numerical checks against exhaustive enumeration of all possible atom assignments.
- If the formula scales, it could serve as a building block for higher-order decompositions involving more than three variables.
Load-bearing premise
The do-operation must be unambiguously defined for every joint distribution of random variables and must generate atoms that meet every listed axiom without requiring further unstated restrictions.
What would settle it
A concrete joint distribution on three or four binary variables where the formula produces atom values that violate at least one Williams-Beer axiom or disagree with the unique known values in a fully enumerated small system.
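A candidate counterexample of this kind can be checked mechanically. The sketch below is one possible reading, not the paper's implementation: it assumes that Un(X→Z|Y) = Σ_y Pr(Y=y) I(A_y; C_y), where (A_y, C_y) is the do-operation output with target marginal Pr(Z|Y=y), and it fills in the remaining atoms with the standard bookkeeping identities I(X;Z) = Red + Un and I(X,Y;Z) = Red + Un_X + Un_Y + Syn. All function names are hypothetical.

```python
from collections import defaultdict
from math import log2


def marginal(joint, idx):
    """Marginalise a dict {tuple: prob} onto the coordinates listed in idx."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[i] for i in idx)] += p
    return dict(out)


def mutual_information(pair):
    """I(A;B) in bits for a dict {(a, b): prob}."""
    pa, pb = marginal(pair, [0]), marginal(pair, [1])
    return sum(p * log2(p / (pa[(a,)] * pb[(b,)]))
               for (a, b), p in pair.items() if p > 0)


def unique_information(pxyz, src, other):
    """Assumed reading of Un(src -> Z | other); coordinate 2 is the target Z."""
    psz = marginal(pxyz, [src, 2])    # Pr(S=s, Z=z)
    pz = marginal(pxyz, [2])          # Pr(Z=z)
    poz = marginal(pxyz, [other, 2])  # Pr(O=o, Z=z)
    po = marginal(pxyz, [other])      # Pr(O=o)
    total = 0.0
    for (o,), p_o in po.items():
        if p_o == 0:
            continue
        # do-operation: keep Pr(S | Z), impose the marginal Pr(Z | O=o).
        pair = {}
        for (s, z), p in psz.items():
            c = poz.get((o, z), 0.0) / p_o
            if p > 0 and c > 0:
                pair[(s, z)] = p / pz[(z,)] * c
        total += p_o * mutual_information(pair)
    return total


def pid_atoms(pxyz):
    """Atoms for sources X, Y (coordinates 0, 1) and target Z (coordinate 2)."""
    i_xz = mutual_information(marginal(pxyz, [0, 2]))
    i_xyz = mutual_information({((x, y), z): p for (x, y, z), p in pxyz.items()})
    un_x = unique_information(pxyz, 0, 1)
    un_y = unique_information(pxyz, 1, 0)
    red = i_xz - un_x                  # bookkeeping: I(X;Z) = Red + Un_X
    syn = i_xyz - un_x - un_y - red    # bookkeeping: total = sum of atoms
    return {"un_x": un_x, "un_y": un_y, "red": red, "syn": syn}


# Three small test systems with uniform binary sources.
xor = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
copy_bit = {(b, b, b): 0.5 for b in (0, 1)}
indep_pair = {(x, y, (x, y)): 0.25 for x in (0, 1) for y in (0, 1)}

for name, dist in [("xor", xor), ("copy", copy_bit), ("pair", indep_pair)]:
    print(name, {k: round(v, 6) for k, v in pid_atoms(dist).items()})
```

Under this reading, XOR comes out as one bit of pure synergy, the copied bit as one bit of pure redundancy, and the target Z = (X, Y) with independent sources as one bit of unique information per source. A distribution for which any atom comes out negative would be the kind of evidence described above, provided the reading assumed here matches the paper's definitions.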
Original abstract
Mutual information between two random variables is a well-studied notion, whose understanding is fairly complete. Mutual information between one random variable and a pair of other random variables, however, is a far more involved notion. Specifically, Shannon's mutual information does not capture fine-grained interactions between those three variables, resulting in limited insights in complex systems. To capture these fine-grained interactions, in 2010 Williams and Beer proposed to decompose this mutual information to information atoms, called unique, redundant, and synergistic, and proposed several operational axioms that these atoms must satisfy. In spite of numerous efforts, a general formula which satisfies these axioms has yet to be found. Inspired by Judea Pearl's do-calculus, we resolve this open problem by introducing the do-operation, an operation over the variable system which sets a certain marginal to a desired value, which is distinct from any existing approaches. Using this operation, we provide the first explicit formula for calculating the information atoms so that Williams and Beer's axioms are satisfied, as well as additional properties from subsequent studies in the field.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to resolve the long-standing open problem of finding an explicit formula for the partial information decomposition (PID) atoms (unique, redundant, and synergistic) that satisfy the Williams-Beer axioms from 2010, along with additional properties identified in later work. It introduces a novel 'do-operation' (inspired by Pearl's do-calculus) that sets a chosen marginal to a desired value and uses this operation to construct the atoms via an explicit formula.
Significance. If the do-operation is shown to be canonically defined and the resulting formula satisfies non-negativity together with the full set of Williams-Beer lattice axioms for arbitrary joint distributions, the result would be a substantial contribution to information theory, supplying the first closed-form solution to a problem that has remained unresolved despite extensive prior effort.
major comments (1)
- [Definition and properties of the do-operation; theorem on axiom compliance] The do-operation is the load-bearing primitive for the explicit formula. For general (especially continuous or non-discrete) random variables, the operation of setting one marginal to a prescribed value while preserving the remainder of the joint is not uniquely determined by the observed distribution; multiple interventional measures can realize the same marginal constraint. The manuscript must therefore supply an explicit completion rule, prove that the rule is canonical, and verify that the resulting atoms remain non-negative and obey the complete Williams-Beer axioms on every joint (see the construction that follows the definition of the do-operation and the subsequent theorem claiming axiom satisfaction).
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. The major comment concerns the definition, uniqueness, and scope of the do-operation. We address this point below and will revise the manuscript accordingly.
- Referee: The do-operation is the load-bearing primitive for the explicit formula. For general (especially continuous or non-discrete) random variables, the operation of setting one marginal to a prescribed value while preserving the remainder of the joint is not uniquely determined by the observed distribution; multiple interventional measures can realize the same marginal constraint. The manuscript must therefore supply an explicit completion rule, prove that the rule is canonical, and verify that the resulting atoms remain non-negative and obey the complete Williams-Beer axioms on every joint (see the construction that follows the definition of the do-operation and the subsequent theorem claiming axiom satisfaction).
  Authors: We appreciate this observation. The manuscript develops the do-operation and the explicit formula primarily in the setting of discrete random variables with finite support, which is the standard setting for the Williams-Beer PID axioms and most subsequent work in the field. In this discrete case, the do-operation is defined by adjusting the probability mass function to enforce the desired marginal while keeping the conditional distributions of the remaining variables unchanged; this yields a unique interventional distribution. We will revise the manuscript to state this completion rule explicitly at the definition of the do-operation, include a short proof of its uniqueness (canonicity) under the discrete assumption, and confirm that the subsequent theorem establishing non-negativity and axiom compliance applies to all finite discrete joints. We agree that the continuous case requires additional specification and will add a remark noting this as a direction for future research.
  Revision: yes
Circularity Check
No significant circularity; derivation rests on newly introduced do-operation
The paper introduces a novel do-operation (distinct from existing approaches) to construct an explicit formula for the PID atoms that satisfies the Williams-Beer axioms. This provides independent grounding rather than reducing the result to prior definitions, fitted parameters, or self-citations by construction. No load-bearing self-citation chains, self-definitional reductions, or renamings of known results are exhibited in the derivation chain. The central claim is self-contained against the new operation's properties.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Williams and Beer's operational axioms for unique, redundant, and synergistic information atoms must be satisfied.
invented entities (1)
- do-operation: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: $\operatorname{do}(D_{X,Z} \mid D_C) \triangleq D_{A,C}$ with $\Pr(A=x, C=z) = \Pr(X=x \mid Z=z)\,\Pr(C=z)$; $\mathrm{Un}(X \to Z \mid Y) = \sum_{y} \Pr(Y=y)\, I(A_y; C_y)$
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, absolute_floor_iff_bare_distinguishability (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Lemma 1: $H(X \mid Z) = H(X' \mid Z) = H(X' \mid Z, Y)$ and $H(X') = H(X)$
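The two quoted passages fit together in a short derivation, sketched here under an assumption the excerpts do not state outright: that X′ is the variable with joint law Pr(X′=x, Z=z, Y=y) = Pr(X=x|Z=z) Pr(Z=z, Y=y), so that (A_y, C_y) is the pair (X′, Z) conditioned on Y=y. Identifying I(X;Z) − Un(X→Z|Y) with the redundant atom is the usual Williams-Beer bookkeeping and is likewise assumed rather than quoted.

```latex
\begin{align*}
\mathrm{Un}(X \to Z \mid Y)
  &= \sum_{y} \Pr(Y=y)\, I(A_y; C_y) = I(X'; Z \mid Y) \\
  &= H(X' \mid Y) - H(X' \mid Z, Y) \\
  &= H(X' \mid Y) - H(X \mid Z)
     && \text{(Lemma 1)} \\[4pt]
I(X; Z) - \mathrm{Un}(X \to Z \mid Y)
  &= H(X) - H(X' \mid Y) \\
  &= H(X') - H(X' \mid Y) = I(X'; Y) \;\ge\; 0
     && \text{(Lemma 1: } H(X') = H(X)\text{)}
\end{align*}
```

If the assumed reading is right, the redundant atom is itself a mutual information, which would account for its non-negativity without any optimization step.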
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.