Maximum Entropy of Sums of Independent Ternary Random Variables
Pith reviewed 2026-05-13 05:15 UTC · model grok-4.3
The pith
The entropy of the sum of independent ternary random variables is maximized when the first n-1 are uniform on {0,2} and the last assigns a specifically tuned probability to the value 1.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
If X_1, ..., X_n are independent random variables each taking values in {0,1,2}, then the entropy of S_n = X_1 + ... + X_n is maximized when X_1, ..., X_{n-1} are each uniform on {0,2} and X_n satisfies Prob(X_n=0) = Prob(X_n=2) = w/2, Prob(X_n=1) = 1-w, where w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} and B_m denotes a binomial random variable with m trials and success probability 1/2.
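A minimal sketch of this construction in Python (helper names are ours, not the paper's): it computes the binomial entropies H(B_m) for B_m ~ Bin(m, 1/2) and the weight w exactly as defined above, then assembles the claimed maximizing pmfs.

```python
from math import comb, log2

def binomial_entropy(m: int) -> float:
    """Shannon entropy in bits of B_m ~ Bin(m, 1/2)."""
    probs = [comb(m, k) / 2**m for k in range(m + 1)]
    return -sum(p * log2(p) for p in probs if p > 0)

def weight_w(n: int) -> float:
    """w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} from the core claim."""
    return 1.0 / (1.0 + 2.0 ** (-binomial_entropy(n) + binomial_entropy(n - 1)))

def claimed_maximizer(n: int):
    """pmfs over {0,1,2}: n-1 copies uniform on {0,2}, last one tuned via w."""
    w = weight_w(n)
    uniform_02 = [0.5, 0.0, 0.5]
    last = [w / 2, 1 - w, w / 2]
    return [uniform_02] * (n - 1) + [last]

# Sanity check: for n = 1, H(B_1) = 1 and H(B_0) = 0, so w = (1 + 2^{-1})^{-1}.
print(weight_w(1))  # 2/3 up to floating point
```

For n = 1 there are no uniform {0,2} factors, and the claim reduces to a single symmetric three-point distribution with weight 1/3 on the middle value.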
What carries the argument
Application of the Hermite-Biehler theorem together with Newton's inequalities and Yu's maximum-entropy result for ultra-log-concave distributions to the probability generating function of the ternary convolution.
If this is right
- The maximizing distributions admit a recursive description that yields an explicit formula for any fixed n.
- The maximum entropy value equals the entropy of a certain binomial plus a correction term involving the weight w.
- The result supplies the ternary analogue of the Shepp-Olkin-Mateev theorem for the binary case.
- Any other choice of independent ternary distributions yields strictly smaller sum entropy.
Where Pith is reading between the lines
- The same recursive pattern may produce candidate maximizers for alphabets larger than three.
- The explicit form could be used to bound achievable rates in discrete memoryless channels whose input symbols are ternary.
- Direct numerical maximization for small n would serve as an immediate consistency check of the closed-form expression.
Load-bearing premise
The analytic and inequality tools developed for binary or ultra-log-concave cases extend without change to the entropy functional of a sum whose components have three-point support.
What would settle it
For n=2, compute the numerical entropy of the sum under the claimed pair of distributions and compare it against the entropy obtained from any other pair, such as both variables uniform on {0,1,2}; the entropy under the claimed pair must be strictly larger.
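A sketch of exactly this check, assuming only the definition of w above (helper names are ours): it convolves the two pmfs for n = 2 and compares the sum entropy of the claimed pair against the pair with both variables uniform on {0,1,2}.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, skipping zero-probability atoms."""
    return -sum(x * log2(x) for x in p if x > 0)

def convolve(p, q):
    """pmf of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# w for n = 2: H(B_2) = 1.5 and H(B_1) = 1, so w = (1 + 2^{-0.5})^{-1}.
w = 1.0 / (1.0 + 2.0 ** (-1.5 + 1.0))
claimed = convolve([0.5, 0.0, 0.5], [w / 2, 1 - w, w / 2])
uniform = convolve([1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3])

print(entropy(claimed), entropy(uniform))  # claimed should be strictly larger
```

Running this gives roughly 2.272 bits for the claimed pair versus roughly 2.197 bits for the uniform pair. It is a spot check against one alternative, not a proof of optimality over all pairs.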
Original abstract
The classical problem of maximizing the Shannon entropy of a sum of independent random variables supported on a finite alphabet is considered and settled in the ternary case. Namely, the following theorem is established: if \(X_1,\ldots,X_n\) are independent random variables taking values in \(\{0,1,2\}\), then the entropy of \(S_n=X_1+\cdots+X_n\) is maximized when \(X_1,\ldots,X_{n-1}\) are uniform on \(\{0,2\}\) and the probability mass function of \(X_n\) is given by \(\Prob(X_n=0) = \Prob(X_n=2) = w/2\), \(\Prob(X_n=1) = 1-w\), where \(w = \big(1 + 2^{-H(B_n)+H(B_{n-1})}\big)^{-1}\) and \(B_m\sim \Bin(m,1/2)\). The statement can be seen as an extension to ternary alphabets of the Shepp--Olkin--Mateev theorem. The proof uses the Hermite--Biehler theorem, Newton's inequalities, and Yu's maximum-entropy theorem for ultra-log-concave distributions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves that for independent random variables X_1, ..., X_n each taking values in {0,1,2}, the Shannon entropy H(S_n), where S_n = X_1 + ... + X_n, is maximized when X_1, ..., X_{n-1} are uniform on {0,2} and X_n has pmf Prob(X_n=0) = Prob(X_n=2) = w/2, Prob(X_n=1) = 1-w, with w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} and B_m ~ Bin(m, 1/2). The result extends the Shepp-Olkin-Mateev theorem to the ternary alphabet. The proof invokes the Hermite-Biehler theorem for root interlacing, Newton's inequalities for log-concavity of coefficients, and Yu's maximum-entropy theorem for ultra-log-concave distributions applied to the pmf of S_n.
Significance. If the central claim holds, the paper provides a complete and explicit characterization of the maximum-entropy configuration for sums of independent ternary random variables, directly analogous to the binary case. This settles an open extension of a classical problem in information theory and supplies a concrete construction that can be used to derive tight bounds on entropy of discrete sums. The reliance on classical polynomial inequalities (Hermite-Biehler, Newton) and Yu's ULC result is a methodological strength when the required properties are verified.
major comments (2)
- [Proof of the main theorem] Proof of the main theorem: the argument applies Yu's maximum-entropy theorem for ultra-log-concave distributions directly to the pmf of Sn. However, the generating function of Sn is the product of n quadratic factors; the manuscript must explicitly verify that the chosen configuration (uniform {0,2} for the first n-1 variables together with the w-parameterized last variable) produces a coefficient sequence that is ultra-log-concave (and real-rooted). Convolution of real-rooted quadratics does not automatically preserve ultra-log-concavity on a ternary support, so this preservation step is load-bearing for invoking Yu's theorem and must be shown rather than assumed by analogy with the binary case.
- [Section 3] Section 3 (or wherever the generating-function analysis appears): the application of the Hermite-Biehler theorem and Newton's inequalities requires confirming that the specific pmf of Sn under the proposed maximizer satisfies the real-rootedness and log-concavity hypotheses. The manuscript should include a lemma establishing these properties for the product polynomial with the given w, because failure of strict ultra-log-concavity would invalidate the appeal to Yu's theorem.
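The hypotheses the referee asks about can at least be probed numerically before a proof is attempted. A sketch for n = 2 (helper names are ours) that builds the pmf of S_2 under the proposed maximizer and tests ordinary log-concavity of its coefficients, a necessary condition for ultra-log-concavity:

```python
def convolve(p, q):
    """pmf of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Proposed maximizer for n = 2: w = (1 + 2^{H(B_1) - H(B_2)})^{-1} = (1 + 2^{-1/2})^{-1}.
w = 1.0 / (1.0 + 2.0 ** -0.5)
pmf = convolve([0.5, 0.0, 0.5], [w / 2, 1 - w, w / 2])

# Log-concavity differences p_k^2 - p_{k-1} p_{k+1} over the interior of the support;
# nonnegative values mean the sequence is log-concave.
checks = [pmf[k] ** 2 - pmf[k - 1] * pmf[k + 1] for k in range(1, len(pmf) - 1)]
print(checks)  # first and last differences vanish up to rounding
```

For n = 2 the inequality holds with equality at k = 1 and k = 3 (the differences are zero up to rounding), so any appeal to strict ultra-log-concavity does need the explicit verification requested above.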
minor comments (2)
- [Abstract and introduction] The definition of w is given in terms of binary binomial entropies; a brief remark clarifying why this particular functional form arises from the entropy objective (rather than being fitted post hoc) would improve readability.
- [Preliminaries] Ensure that the statement of Yu's theorem is quoted with its precise hypotheses (ultra-log-concavity on the support) so that the reader can immediately see the required conditions on the pmf of Sn.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable suggestions. We have carefully considered the major comments and will revise the manuscript to provide the explicit verifications requested, thereby strengthening the proof of the main theorem.
Point-by-point responses
Referee: Proof of the main theorem: the argument applies Yu's maximum-entropy theorem for ultra-log-concave distributions directly to the pmf of Sn. However, the generating function of Sn is the product of n quadratic factors; the manuscript must explicitly verify that the chosen configuration (uniform {0,2} for the first n-1 variables together with the w-parameterized last variable) produces a coefficient sequence that is ultra-log-concave (and real-rooted). Convolution of real-rooted quadratics does not automatically preserve ultra-log-concavity on a ternary support, so this preservation step is load-bearing for invoking Yu's theorem and must be shown rather than assumed by analogy with the binary case.
Authors: We agree with the referee that the ultra-log-concavity and real-rootedness must be explicitly verified for the specific configuration to justify the application of Yu's theorem. In the revised version, we will insert a dedicated lemma immediately before the main proof that establishes these properties for the generating function under the given choice of distributions. Specifically, we will show by induction on n that the coefficient sequence of the product polynomial is real-rooted and ultra-log-concave, using the recursive structure induced by the uniform {0,2} variables and the entropy-balancing choice of w. This removes any reliance on analogy with the binary case and directly addresses the concern. revision: yes
Referee: Section 3 (or wherever the generating-function analysis appears): the application of the Hermite-Biehler theorem and Newton's inequalities requires confirming that the specific pmf of Sn under the proposed maximizer satisfies the real-rootedness and log-concavity hypotheses. The manuscript should include a lemma establishing these properties for the product polynomial with the given w, because failure of strict ultra-log-concavity would invalidate the appeal to Yu's theorem.
Authors: We acknowledge that the applications of the Hermite-Biehler theorem and Newton's inequalities require confirmation for the specific pmf. We will add a supporting lemma in the generating-function section that applies these tools to the product of the quadratic factors with the parameterized last variable. The lemma will prove real-rootedness and the log-concavity of the coefficients, ensuring the hypotheses for Yu's theorem are met. The choice of w, defined via the binomial entropies, is key to this verification. revision: yes
Circularity Check
No significant circularity; derivation relies on external theorems without self-referential reduction to inputs.
Full rationale
The paper proves the stated maximum-entropy configuration for the ternary sum by direct application of the Hermite-Biehler theorem, Newton's inequalities, and Yu's maximum-entropy theorem for ultra-log-concave distributions. The parameter w is defined explicitly in terms of the binomial entropies H(B_n) and H(B_{n-1}), which are fixed constants not obtained by fitting to the target ternary entropy H(S_n). No equation or step reduces the claimed result to a fitted input, self-citation chain, or definitional tautology. The derivation chain is therefore self-contained relative to the cited external results.
Axiom & Free-Parameter Ledger
axioms (3)
- (standard math) Hermite-Biehler theorem
- (standard math) Newton's inequalities
- (standard math) Yu's maximum-entropy theorem for ultra-log-concave distributions
Reference graph
Works this paper leans on
- [1] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., John Wiley and Sons, Inc., 2006.
- [2] P. Harremoës, ``Binomial and Poisson distributions as maximum entropy distributions,'' IEEE Trans. Inform. Theory 47 (5) (2001), 2039--2041. https://doi.org/10.1109/18.930936
- [3] E. Hillion and O. T. Johnson, ``Discrete versions of the transport equation and the Shepp--Olkin conjecture,'' Ann. Probab. 44 (1) (2016), 276--306. https://doi.org/10.1214/14-AOP973
- [4] E. Hillion and O. T. Johnson, ``A proof of the Shepp--Olkin entropy concavity conjecture,'' Bernoulli 23 (4B) (2017), 3638--3649. https://doi.org/10.3150/16-BEJ860
- [5] E. Hillion and O. T. Johnson, ``A proof of the Shepp--Olkin entropy monotonicity conjecture,'' Electron. J. Probab. 24 (2019), article no. 126. https://doi.org/10.1214/19-EJP380
- [6] E. T. Jaynes, ``Information theory and statistical mechanics,'' Phys. Rev. 106 (4) (1957), 620--630. https://doi.org/10.1103/PhysRev.106.620
- [7] O. T. Johnson, ``Log-concavity and the maximum entropy property of the Poisson distribution,'' Stochastic Process. Appl. 117 (6) (2007), 791--802. https://doi.org/10.1016/j.spa.2006.10.006
- [8] O. Johnson, I. Kontoyiannis, and M. Madiman, ``Log-concavity, ultra-log-concavity, and a maximum entropy property of discrete compound Poisson measures,'' Discrete Applied Math. 161 (9) (2013), 1232--1250. https://doi.org/10.1016/j.dam.2011.08.025
- [9] M. Kovačević, ``On the maximum entropy of a sum of independent discrete random variables,'' Theory Probab. Appl. 66 (3) (2021), 482--487. https://doi.org/10.1137/S0040585X97T99054X
- [10] P. Mateev, ``The entropy of the multinomial distribution,'' Theory Probab. Appl. 23 (1) (1978), 188--190. https://doi.org/10.1137/1123020
- [11] E. Ordentlich, ``Maximizing the entropy of a sum of independent bounded random variables,'' IEEE Trans. Inform. Theory 52 (5) (2006), 2176--2181. https://doi.org/10.1109/TIT.2006.872858
- [12] Q. I. Rahman and G. Schmeisser, Analytic Theory of Polynomials, Oxford University Press, 2002.
- [13] L. A. Shepp and I. Olkin, ``Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution,'' Tech. Report 131, Stanford University, 1978. Reprinted in Contributions to Probability, Academic Press, New York, 1981, pp. 201--206. https://doi.org/10.1016/B978-0-12-274460-0.50022-9
- [14] R. P. Stanley, ``Log-concave and unimodal sequences in algebra, combinatorics, and geometry,'' Ann. New York Acad. Sci. 576 (1989), 500--535. https://doi.org/10.1111/j.1749-6632.1989.tb16434.x
- [15] Y. Yu, ``Maximum entropy for sums of symmetric and bounded random variables: a short derivation,'' IEEE Trans. Inform. Theory 54 (4) (2008), 1818--1819. https://doi.org/10.1109/TIT.2008.917660
- [16] Y. Yu, ``On the maximum entropy properties of the binomial distribution,'' IEEE Trans. Inform. Theory 54 (7) (2008), 3351--3353. https://doi.org/10.1109/TIT.2008.924715