Maximum Entropy of Sums of Independent Ternary Random Variables
Pith reviewed 2026-05-13 05:15 UTC · model grok-4.3
The pith
The entropy of the sum of independent ternary random variables is maximized when the first n-1 are uniform on {0,2} and the last assigns a specifically tuned probability to the value 1.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
If X_1, ..., X_n are independent random variables each taking values in {0,1,2}, then the entropy of S_n = X_1 + ... + X_n is maximized when X_1, ..., X_{n-1} are each uniform on {0,2} and X_n satisfies Prob(X_n=0) = Prob(X_n=2) = w/2, Prob(X_n=1) = 1-w, where w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} and B_m denotes a binomial random variable with m trials and success probability 1/2.
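A minimal sketch of this construction in Python (helper names are ours, not the paper's): it computes the binomial entropies H(B_m) for B_m ~ Bin(m, 1/2) and the weight w exactly as defined above, then assembles the claimed maximizing pmfs.

```python
from math import comb, log2

def binomial_entropy(m: int) -> float:
    """Shannon entropy in bits of B_m ~ Bin(m, 1/2)."""
    probs = [comb(m, k) / 2**m for k in range(m + 1)]
    return -sum(p * log2(p) for p in probs if p > 0)

def weight_w(n: int) -> float:
    """w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} from the core claim."""
    return 1.0 / (1.0 + 2.0 ** (-binomial_entropy(n) + binomial_entropy(n - 1)))

def claimed_maximizer(n: int):
    """pmfs over {0,1,2}: n-1 copies uniform on {0,2}, last one tuned via w."""
    w = weight_w(n)
    uniform_02 = [0.5, 0.0, 0.5]
    last = [w / 2, 1 - w, w / 2]
    return [uniform_02] * (n - 1) + [last]

# Sanity check: for n = 1, H(B_1) = 1 and H(B_0) = 0, so w = (1 + 2^{-1})^{-1}.
print(weight_w(1))  # 2/3 up to floating point
```

For n = 1 there are no uniform {0,2} factors, and the claim reduces to a single symmetric three-point distribution with weight 1/3 on the middle value.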
What carries the argument
Application of the Hermite-Biehler theorem together with Newton's inequalities and Yu's maximum-entropy result for ultra-log-concave distributions to the probability generating function of the ternary convolution.
If this is right
- The maximizing distributions admit a recursive description that yields an explicit formula for any fixed n.
- The maximum entropy value equals the entropy of a certain binomial plus a correction term involving the weight w.
- The result supplies the ternary analogue of the Shepp-Olkin-Mateev theorem for the binary case.
- Any other choice of independent ternary distributions yields strictly smaller sum entropy.
Where Pith is reading between the lines
- The same recursive pattern may produce candidate maximizers for alphabets larger than three.
- The explicit form could be used to bound achievable rates in discrete memoryless channels whose input symbols are ternary.
- Direct numerical maximization for small n would serve as an immediate consistency check of the closed-form expression.
Load-bearing premise
The analytic and inequality tools developed for binary or ultra-log-concave cases extend without change to the entropy functional of a sum whose components have three-point support.
What would settle it
For n=2, compute the numerical entropy of the sum under the claimed pair of distributions and compare it against the entropy obtained from any other pair, such as both variables uniform on {0,1,2}; the entropy under the claimed pair must be strictly larger.
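A sketch of exactly this check, assuming only the definition of w above (helper names are ours): it convolves the two pmfs for n = 2 and compares the sum entropy of the claimed pair against the pair with both variables uniform on {0,1,2}.

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, skipping zero-probability atoms."""
    return -sum(x * log2(x) for x in p if x > 0)

def convolve(p, q):
    """pmf of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# w for n = 2: H(B_2) = 1.5 and H(B_1) = 1, so w = (1 + 2^{-0.5})^{-1}.
w = 1.0 / (1.0 + 2.0 ** (-1.5 + 1.0))
claimed = convolve([0.5, 0.0, 0.5], [w / 2, 1 - w, w / 2])
uniform = convolve([1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3])

print(entropy(claimed), entropy(uniform))  # claimed should be strictly larger
```

Running this gives roughly 2.272 bits for the claimed pair versus roughly 2.197 bits for the uniform pair. It is a spot check against one alternative, not a proof of optimality over all pairs.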
Original abstract
The classical problem of maximizing the Shannon entropy of a sum of independent random variables supported on a finite alphabet is considered and settled in the ternary case. Namely, the following theorem is established: if \(X_1,\ldots,X_n\) are independent random variables taking values in \(\{0,1,2\}\), then the entropy of \(S_n=X_1+\cdots+X_n\) is maximized when \(X_1,\ldots,X_{n-1}\) are uniform on \(\{0,2\}\) and the probability mass function of \(X_n\) is given by \(\Prob(X_n=0) = \Prob(X_n=2) = w/2\), \(\Prob(X_n=1) = 1-w\), where \(w = \big(1 + 2^{-H(B_n)+H(B_{n-1})}\big)^{-1}\) and \(B_m\sim \Bin(m,1/2)\). The statement can be seen as an extension to ternary alphabets of the Shepp--Olkin--Mateev theorem. The proof uses the Hermite--Biehler theorem, Newton's inequalities, and Yu's maximum-entropy theorem for ultra-log-concave distributions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves that for independent random variables X_1, ..., X_n each taking values in {0,1,2}, the Shannon entropy H(S_n), where S_n = X_1 + ... + X_n, is maximized when X_1, ..., X_{n-1} are uniform on {0,2} and X_n has pmf Prob(X_n=0) = Prob(X_n=2) = w/2, Prob(X_n=1) = 1-w, with w = (1 + 2^{-H(B_n) + H(B_{n-1})})^{-1} and B_m ~ Bin(m, 1/2). The result extends the Shepp-Olkin-Mateev theorem to the ternary alphabet. The proof invokes the Hermite-Biehler theorem for root interlacing, Newton's inequalities for log-concavity of coefficients, and Yu's maximum-entropy theorem for ultra-log-concave distributions applied to the pmf of S_n.
Significance. If the central claim holds, the paper provides a complete and explicit characterization of the maximum-entropy configuration for sums of independent ternary random variables, directly analogous to the binary case. This settles an open extension of a classical problem in information theory and supplies a concrete construction that can be used to derive tight bounds on entropy of discrete sums. The reliance on classical polynomial inequalities (Hermite-Biehler, Newton) and Yu's ULC result is a methodological strength when the required properties are verified.
major comments (2)
- [Proof of the main theorem] Proof of the main theorem: the argument applies Yu's maximum-entropy theorem for ultra-log-concave distributions directly to the pmf of Sn. However, the generating function of Sn is the product of n quadratic factors; the manuscript must explicitly verify that the chosen configuration (uniform {0,2} for the first n-1 variables together with the w-parameterized last variable) produces a coefficient sequence that is ultra-log-concave (and real-rooted). Convolution of real-rooted quadratics does not automatically preserve ultra-log-concavity on a ternary support, so this preservation step is load-bearing for invoking Yu's theorem and must be shown rather than assumed by analogy with the binary case.
- [Section 3] Section 3 (or wherever the generating-function analysis appears): the application of the Hermite-Biehler theorem and Newton's inequalities requires confirming that the specific pmf of Sn under the proposed maximizer satisfies the real-rootedness and log-concavity hypotheses. The manuscript should include a lemma establishing these properties for the product polynomial with the given w, because failure of strict ultra-log-concavity would invalidate the appeal to Yu's theorem.
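The hypotheses the referee asks about can at least be probed numerically before a proof is attempted. A sketch for n = 2 (helper names are ours) that builds the pmf of S_2 under the proposed maximizer and tests ordinary log-concavity of its coefficients, a necessary condition for ultra-log-concavity:

```python
def convolve(p, q):
    """pmf of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Proposed maximizer for n = 2: w = (1 + 2^{H(B_1) - H(B_2)})^{-1} = (1 + 2^{-1/2})^{-1}.
w = 1.0 / (1.0 + 2.0 ** -0.5)
pmf = convolve([0.5, 0.0, 0.5], [w / 2, 1 - w, w / 2])

# Log-concavity differences p_k^2 - p_{k-1} p_{k+1} over the interior of the support;
# nonnegative values mean the sequence is log-concave.
checks = [pmf[k] ** 2 - pmf[k - 1] * pmf[k + 1] for k in range(1, len(pmf) - 1)]
print(checks)  # first and last differences vanish up to rounding
```

For n = 2 the inequality holds with equality at k = 1 and k = 3 (the differences are zero up to rounding), so any appeal to strict ultra-log-concavity does need the explicit verification requested above.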
minor comments (2)
- [Abstract and introduction] The definition of w is given in terms of binary binomial entropies; a brief remark clarifying why this particular functional form arises from the entropy objective (rather than being fitted post hoc) would improve readability.
- [Preliminaries] Ensure that the statement of Yu's theorem is quoted with its precise hypotheses (ultra-log-concavity on the support) so that the reader can immediately see the required conditions on the pmf of Sn.
Simulated Author's Rebuttal
We thank the referee for the thorough review and valuable suggestions. We have carefully considered the major comments and will revise the manuscript to provide the explicit verifications requested, thereby strengthening the proof of the main theorem.
Point-by-point responses
Referee: Proof of the main theorem: the argument applies Yu's maximum-entropy theorem for ultra-log-concave distributions directly to the pmf of Sn. However, the generating function of Sn is the product of n quadratic factors; the manuscript must explicitly verify that the chosen configuration (uniform {0,2} for the first n-1 variables together with the w-parameterized last variable) produces a coefficient sequence that is ultra-log-concave (and real-rooted). Convolution of real-rooted quadratics does not automatically preserve ultra-log-concavity on a ternary support, so this preservation step is load-bearing for invoking Yu's theorem and must be shown rather than assumed by analogy with the binary case.
Authors: We agree with the referee that the ultra-log-concavity and real-rootedness must be explicitly verified for the specific configuration to justify the application of Yu's theorem. In the revised version, we will insert a dedicated lemma immediately before the main proof that establishes these properties for the generating function under the given choice of distributions. Specifically, we will show by induction on n that the coefficient sequence of the product polynomial is real-rooted and ultra-log-concave, using the recursive structure induced by the uniform {0,2} variables and the entropy-balancing choice of w. This removes any reliance on analogy with the binary case and directly addresses the concern. revision: yes
Referee: Section 3 (or wherever the generating-function analysis appears): the application of the Hermite-Biehler theorem and Newton's inequalities requires confirming that the specific pmf of Sn under the proposed maximizer satisfies the real-rootedness and log-concavity hypotheses. The manuscript should include a lemma establishing these properties for the product polynomial with the given w, because failure of strict ultra-log-concavity would invalidate the appeal to Yu's theorem.
Authors: We acknowledge that the applications of the Hermite-Biehler theorem and Newton's inequalities require confirmation for the specific pmf. We will add a supporting lemma in the generating-function section that applies these tools to the product of the quadratic factors with the parameterized last variable. The lemma will prove real-rootedness and the log-concavity of the coefficients, ensuring the hypotheses for Yu's theorem are met. The choice of w, defined via the binomial entropies, is key to this verification. revision: yes
Circularity Check
No significant circularity; derivation relies on external theorems without self-referential reduction to inputs.
Full rationale
The paper proves the stated maximum-entropy configuration for the ternary sum by direct application of the Hermite-Biehler theorem, Newton's inequalities, and Yu's maximum-entropy theorem for ultra-log-concave distributions. The parameter w is defined explicitly in terms of the binomial entropies H(B_n) and H(B_{n-1}), which are fixed constants not obtained by fitting to the target ternary entropy H(S_n). No equation or step reduces the claimed result to a fitted input, self-citation chain, or definitional tautology. The derivation chain is therefore self-contained relative to the cited external results.
Axiom & Free-Parameter Ledger
axioms (3)
- (standard math) Hermite-Biehler theorem
- (standard math) Newton's inequalities
- (standard math) Yu's maximum-entropy theorem for ultra-log-concave distributions
Reference graph
Works this paper leans on
- [1] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed., John Wiley and Sons, Inc., 2006.
- [2] P. Harremoës, ``Binomial and Poisson distributions as maximum entropy distributions,'' IEEE Trans. Inform. Theory 47 (5) (2001), 2039--2041. https://doi.org/10.1109/18.930936
- [3] E. Hillion and O. T. Johnson, ``Discrete versions of the transport equation and the Shepp--Olkin conjecture,'' Ann. Probab. 44 (1) (2016), 276--306. https://doi.org/10.1214/14-AOP973
- [4] E. Hillion and O. T. Johnson, ``A proof of the Shepp--Olkin entropy concavity conjecture,'' Bernoulli 23 (4B) (2017), 3638--3649. https://doi.org/10.3150/16-BEJ860
- [5] E. Hillion and O. T. Johnson, ``A proof of the Shepp--Olkin entropy monotonicity conjecture,'' Electron. J. Probab. 24 (2019), article no. 126. https://doi.org/10.1214/19-EJP380
- [6] E. T. Jaynes, ``Information theory and statistical mechanics,'' Phys. Rev. 106 (4) (1957), 620--630. https://doi.org/10.1103/PhysRev.106.620
- [7] O. T. Johnson, ``Log-concavity and the maximum entropy property of the Poisson distribution,'' Stochastic Process. Appl. 117 (6) (2007), 791--802. https://doi.org/10.1016/j.spa.2006.10.006
- [8] O. Johnson, I. Kontoyiannis, and M. Madiman, ``Log-concavity, ultra-log-concavity, and a maximum entropy property of discrete compound Poisson measures,'' Discrete Applied Math. 161 (9) (2013), 1232--1250. https://doi.org/10.1016/j.dam.2011.08.025
- [9] M. Kovačević, ``On the maximum entropy of a sum of independent discrete random variables,'' Theory Probab. Appl. 66 (3) (2021), 482--487. https://doi.org/10.1137/S0040585X97T99054X
- [10] P. Mateev, ``The entropy of the multinomial distribution,'' Theory Probab. Appl. 23 (1) (1978), 188--190. https://doi.org/10.1137/1123020
- [11] E. Ordentlich, ``Maximizing the entropy of a sum of independent bounded random variables,'' IEEE Trans. Inform. Theory 52 (5) (2006), 2176--2181. https://doi.org/10.1109/TIT.2006.872858
- [12] Q. I. Rahman and G. Schmeisser, Analytic Theory of Polynomials, Oxford University Press, 2002.
- [13] L. A. Shepp and I. Olkin, ``Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution,'' Tech. Report 131, Stanford University, 1978. Reprinted in Contributions to Probability, Academic Press, New York, 1981, pp. 201--206. https://doi.org/10.1016/B978-0-12-274460-0.50022-9
- [14] R. P. Stanley, ``Log-concave and unimodal sequences in algebra, combinatorics, and geometry,'' Ann. New York Acad. Sci. 576 (1989), 500--535. https://doi.org/10.1111/j.1749-6632.1989.tb16434.x
- [15] Y. Yu, ``Maximum entropy for sums of symmetric and bounded random variables: a short derivation,'' IEEE Trans. Inform. Theory 54 (4) (2008), 1818--1819. https://doi.org/10.1109/TIT.2008.917660
- [16] Y. Yu, ``On the maximum entropy properties of the binomial distribution,'' IEEE Trans. Inform. Theory 54 (7) (2008), 3351--3353. https://doi.org/10.1109/TIT.2008.924715