pith. machine review for the scientific record.

arxiv: 2605.07624 · v3 · submitted 2026-05-08 · 💻 cs.IT · math.IT

Recognition: 2 theorem links

· Lean Theorem

Kolmogorov--Nagumo Mean Frameworks for Conditional Entropy

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 07:42 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords: conditional entropy · Kolmogorov-Nagumo mean · η-averaging · Augustin-Csiszár conditional entropy · generalized g-conditional entropy · data processing inequality · conditioning reduces entropy

The pith

A Kolmogorov-Nagumo mean framework represents conditional entropies that η-averaging cannot.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops frameworks for conditional entropy using Kolmogorov-Nagumo means. It first shows that an extended averaging method called (η, ψ)-KN averaging is equivalent to the standard η-averaging under suitable concavification conditions. It then introduces a new approach for generalized g-conditional entropies that goes beyond what η-averaging can express. Specifically, it proves that for some values of α and some joint distributions, the Augustin-Csiszár conditional entropy cannot be expressed as any (η, F)-entropy under the standard averaging, yet is representable in the new framework. The work also gives sufficient conditions under which these new entropies obey the rule that conditioning reduces entropy and the data-processing inequality.
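For orientation, the Kolmogorov-Nagumo (quasi-arithmetic) mean underlying all of these frameworks has a standard textbook definition. The schematic below uses generic notation and is not lifted from the paper; the paper's (η, ψ)-KN averaging composes such a mean with an additional function η.

```latex
% Kolmogorov--Nagumo (quasi-arithmetic) mean of x_1, ..., x_n with
% weights w_i >= 0, \sum_i w_i = 1, for a continuous strictly
% monotone function \psi:
M_{\psi}(x_1, \dots, x_n; w) = \psi^{-1}\!\left( \sum_{i=1}^{n} w_i \, \psi(x_i) \right)
% \psi(t) = t recovers the arithmetic mean; \psi(t) = \log t the
% geometric mean. Schematically, a KN-averaged conditional entropy
% aggregates the pointwise entropies H(X \mid Y = y) under \psi:
H_{\psi}(X \mid Y) = \psi^{-1}\!\left( \sum_{y} p_Y(y) \, \psi\bigl( H(X \mid Y{=}y) \bigr) \right)
```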

Core claim

The paper establishes a new framework for generalized g-conditional entropies based on Kolmogorov-Nagumo means. It demonstrates that this framework can represent the Augustin-Csiszár conditional entropy H_α^C(X|Y) in cases where no (η,F)-entropy under the η-averaging framework can, for certain α and joint distributions p_{X,Y}.

What carries the argument

(η, ψ)-KN averaging, which generalizes η-averaging using Kolmogorov-Nagumo means to support broader classes of conditional entropy measures.
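A minimal numeric sketch of the quasi-arithmetic (KN) mean that this machinery builds on. Function names here are illustrative, not the paper's; the sketch only shows how the choice of ψ swaps one averaging rule for another.

```python
import math

def kn_mean(values, weights, psi, psi_inv):
    """Kolmogorov-Nagumo (quasi-arithmetic) mean:
    psi_inv(sum_i w_i * psi(x_i)) for a strictly monotone psi."""
    assert abs(sum(weights) - 1.0) < 1e-12, "weights must sum to 1"
    return psi_inv(sum(w * psi(x) for w, x in zip(weights, values)))

vals, wts = [1.0, 4.0], [0.5, 0.5]

# psi = identity recovers the ordinary (arithmetic) average.
arith = kn_mean(vals, wts, lambda t: t, lambda t: t)

# psi = log recovers the geometric mean.
geom = kn_mean(vals, wts, math.log, math.exp)

# psi(t) = exp((1 - alpha) * t) gives the exponential means that
# appear in Renyi-type conditional entropies (alpha = 2 here).
alpha = 2.0
expm = kn_mean(vals, wts,
               lambda t: math.exp((1 - alpha) * t),
               lambda s: math.log(s) / (1 - alpha))
```

With α > 1 the exponential mean weights small values more heavily, so it lands strictly below the arithmetic mean of the same inputs.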

Load-bearing premise

The new representations and equivalences depend on the concavification conditions for the averaging and the sufficient conditions for the entropy properties to hold.

What would settle it

Compute H_α^C(X|Y) for a chosen α and p_{X,Y} and check whether it equals the value of any (η,F)-entropy under EAVG; a mismatch that is nevertheless captured by the new framework would support the claim.
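The flavor of such a check can be sketched on a toy joint distribution. The sketch below contrasts two standard conditional Rényi entropies, Arimoto's form and the plain arithmetic average of pointwise Rényi entropies, to show that different averaging conventions already disagree numerically. It does not implement the paper's H_α^C or its (η,F) families; those definitions would have to come from the paper itself.

```python
import math

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (in nats) of a distribution p."""
    return math.log(sum(x ** alpha for x in p)) / (1 - alpha)

def arimoto_cond(joint, alpha):
    """Arimoto conditional Renyi entropy H_alpha^A(X|Y) in nats:
    (alpha/(1-alpha)) * log sum_y (sum_x p(x,y)^alpha)^(1/alpha)."""
    s = sum(sum(pxy ** alpha for pxy in col) ** (1.0 / alpha)
            for col in joint)
    return (alpha / (1 - alpha)) * math.log(s)

def averaged_cond(joint, alpha):
    """Plain arithmetic average of pointwise Renyi entropies:
    sum_y p(y) * H_alpha(X | Y=y)."""
    total = 0.0
    for col in joint:
        py = sum(col)
        total += py * renyi_entropy([pxy / py for pxy in col], alpha)
    return total

# joint[y][x] = p(x, y); columns indexed by y
joint = [[0.60, 0.05],   # y = 0
         [0.05, 0.30]]   # y = 1
alpha = 2.0
h_a = arimoto_cond(joint, alpha)
h_avg = averaged_cond(joint, alpha)
# On this asymmetric distribution the two conventions give different
# values, so the choice of averaging scheme genuinely matters.
```

The gap between the two values on this example is small but nonzero, which is exactly the kind of discrepancy the proposed test would look for against the full family of EAVG representations.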

read the original abstract

This study focuses on conditional entropy frameworks based on the Kolmogorov--Nagumo (KN) mean. First, $(\eta, \psi)$-KN averaging (\texttt{EPKNAVG}), a KN-mean extension of the $\eta$-averaging (\texttt{EAVG}) framework for $(\eta, F)$-entropies, is introduced and proven to be equivalent to \texttt{EAVG} under suitable concavification conditions. Second, motivated by generalized $g$-vulnerability, a new framework is proposed for generalized $g$-conditional entropies. This framework captures conditional entropies beyond the scope of \texttt{EAVG}-type representations. In particular, it is shown that there exists an $\alpha$ and a joint probability distribution $p_{X, Y}$ such that the Augustin--Csisz{\' a}r conditional entropy $H_{\alpha}^{\mathrm{C}}(X|Y)$ cannot be represented by any $(\eta,F)$-entropy satisfying \texttt{EAVG}. In contrast, it is represented within the proposed framework. Furthermore, sufficient conditions are derived under which the proposed generalized $g$-conditional entropies satisfy the conditioning reduces entropy property and the data-processing inequality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces (η, ψ)-KN averaging (EPKNAVG) as a Kolmogorov-Nagumo mean extension of the η-averaging (EAVG) framework for (η, F)-entropies and proves equivalence under suitable concavification conditions. It proposes a new framework for generalized g-conditional entropies motivated by g-vulnerability, shows that there exist α and p_{X,Y} such that the Augustin-Csiszár conditional entropy H_α^C(X|Y) cannot be represented by any (η,F)-entropy under EAVG but is captured in the new framework, and derives sufficient conditions for the new entropies to satisfy conditioning reduces entropy and the data-processing inequality.

Significance. If the non-representability result and the representation in the new framework hold with full rigor, the work meaningfully enlarges the class of conditional entropies that admit axiomatic or averaging-based characterizations, with direct relevance to generalized vulnerability measures. The equivalence theorem between the two averaging schemes is a useful technical contribution that clarifies the relationship between existing and extended frameworks.

major comments (2)
  1. [Abstract] The headline non-representability claim (that H_α^C(X|Y) lies outside every EAVG (η,F)-entropy for some α and p_{X,Y}) requires an argument that rules out arbitrary measurable F, not merely a subclass such as concave or power functions. If the proof proceeds by checking only restricted families of F, the universal quantifier fails and the subsequent claim that the new (η,ψ)-KN framework is strictly more expressive rests on an incomplete exclusion.
  2. The sufficient conditions under which the generalized g-conditional entropies satisfy conditioning reduces entropy and the data-processing inequality are stated only at the level of the abstract; the precise restrictions on g, η, and ψ that make these properties hold must be exhibited explicitly (ideally with a counter-example when the conditions are violated) so that readers can assess their scope.
minor comments (1)
  1. Notation for the new averaging operator (EPKNAVG) and the generalized g-conditional entropy should be introduced with a clear table or diagram contrasting it with EAVG to reduce reader confusion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We appreciate the acknowledgment of the potential significance of the non-representability result and the technical value of the equivalence theorem. We address the two major comments below and will revise the manuscript accordingly to strengthen rigor and clarity.

read point-by-point responses
  1. Referee: [Abstract] The headline non-representability claim (that H_α^C(X|Y) lies outside every EAVG (η,F)-entropy for some α and p_{X,Y}) requires an argument that rules out arbitrary measurable F, not merely a subclass such as concave or power functions. If the proof proceeds by checking only restricted families of F, the universal quantifier fails and the subsequent claim that the new (η,ψ)-KN framework is strictly more expressive rests on an incomplete exclusion.

    Authors: We acknowledge the need for explicit generality in the non-representability argument. The proof in the manuscript is constructed using the axiomatic properties of (η,F)-entropies under EAVG and the specific functional form of H_α^C, which applies to arbitrary measurable F (not restricted to concave or power cases). To remove any possible ambiguity regarding the universal quantifier, we will revise the relevant theorem and its proof to include an explicit step-by-step exclusion that covers general measurable F, thereby confirming that the (η,ψ)-KN framework is strictly more expressive. revision: yes

  2. Referee: [—] The sufficient conditions under which the generalized g-conditional entropies satisfy conditioning reduces entropy and the data-processing inequality are stated only at the level of the abstract; the precise restrictions on g, η, and ψ that make these properties hold must be exhibited explicitly (ideally with a counter-example when the conditions are violated) so that readers can assess their scope.

    Authors: We agree that the sufficient conditions should be stated explicitly in the main text. In the revised manuscript we will add a dedicated subsection (or expand the relevant theorem statement) that precisely specifies the restrictions on g, η, and ψ under which conditioning reduces entropy and the data-processing inequality hold. We will also include counter-examples showing failure of the properties when the conditions are violated, allowing readers to assess the full scope of the results. revision: yes

Circularity Check

0 steps flagged

No circularity: frameworks introduced by definition and properties derived independently

full rationale

The paper defines (η, ψ)-KN averaging explicitly as an extension of EAVG and proves equivalence under concavification conditions. It then defines a new generalized g-conditional entropy framework and exhibits a specific α and joint distribution where H_α^C(X|Y) lies outside EAVG representations but inside the new one. These steps are constructive definitions followed by direct verification of axioms (conditioning reduces entropy, DPI) under stated sufficient conditions. No load-bearing step reduces to a fitted parameter renamed as prediction, no self-citation chain supplies the central uniqueness or non-representability result, and no ansatz is smuggled via prior work. The non-representability claim is an existence statement for one counter-example pair, not an exhaustive search over all F that would require circular justification. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities. The work relies on standard information-theoretic assumptions such as concavity of functions and properties of joint probability distributions.

pith-pipeline@v0.9.0 · 5502 in / 1059 out tokens · 57622 ms · 2026-05-13T07:42:17.017120+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1] A. Rényi, "On measures of entropy and information," in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, vol. 4. University of California Press, 1961, pp. 547-562.
  2. [2] J. Havrda and F. Charvát, "Quantification method of classification processes. Concept of structural a-entropy," Kybernetika, vol. 3, pp. 30-35, 1967.
  3. [3] C. Tsallis, "Possible generalization of Boltzmann-Gibbs statistics," Journal of Statistical Physics, vol. 52, no. 1, pp. 479-487, 1988. https://doi.org/10.1007/BF01016429
  4. [4] B. D. Sharma and D. P. Mittal, "New non-additive measures of entropy for discrete probability distributions," J. Math. Sci., vol. 10, no. 75, pp. 28-40, 1975.
  5. [5] S. Arimoto, "Information measures and capacity of order α for discrete memoryless channels," in 2nd Colloquium, Keszthely, Hungary, 1975, I. Csiszár and P. Elias, Eds., vol. 16. Amsterdam: North-Holland, Colloquia Mathematica Societatis János Bolyai, 1977, pp. 41-52.
  6. [6] M. Hayashi, "Exponential decreasing rate of leaked information in universal random privacy amplification," IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3989-4001, 2011.
  7. [7] A. Teixeira, A. Matos, and L. Antunes, "Conditional Rényi entropies," IEEE Transactions on Information Theory, vol. 58, no. 7, pp. 4273-4277, 2012.
  8. [8] M. Iwamoto and J. Shikata, "Information theoretic security for encryption based on conditional Rényi entropies," in Information Theoretic Security, C. Padró, Ed. Cham: Springer International Publishing, 2014, pp. 103-121.
  9. [9] S. Fehr and S. Berens, "On the conditional Rényi entropy," IEEE Transactions on Information Theory, vol. 60, no. 11, pp. 6801-6810, 2014.
  10. [10] I. Sason and S. Verdú, "Arimoto-Rényi conditional entropy and Bayesian hypothesis testing," in 2017 IEEE International Symposium on Information Theory (ISIT), 2017, pp. 2965-2969.
  11. [11] V. M. Ilić, I. B. Djordjević, and M. Stanković, "On a general definition of conditional Rényi entropies," Proceedings, vol. 2, no. 4, 2018. https://www.mdpi.com/2504-3900/2/4/166
  12. [12] G. Aishwarya and M. Madiman, "Conditional Rényi entropy and the relationships between Rényi capacities," Entropy, vol. 22, no. 5, 2020. https://www.mdpi.com/1099-4300/22/5/526
  13. [13] S. Furuichi, "Information theoretical properties of Tsallis entropies," Journal of Mathematical Physics, vol. 47, no. 2, p. 023302, 2006. https://doi.org/10.1063/1.2165744
  14. [14] S. T. Manije, M. B. Gholamreza, and A. Mohammad, "Conditional Tsallis entropy," Cybernetics and Information Technologies, vol. 13, no. 2, pp. 37-42, 2013. https://doi.org/10.2478/cait-2013-0012
  15. [15] A. Teixeira, A. Souto, and L. Antunes, "On conditional Tsallis entropy," Entropy, vol. 23, no. 11, 2021. https://www.mdpi.com/1099-4300/23/11/1427
  16. [16] P. D. Grünwald and A. P. Dawid, "Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory," Ann. Statist., vol. 32, no. 4, pp. 1367-1433, 2004.
  17. [17] T. Gneiting and A. E. Raftery, "Strictly proper scoring rules, prediction, and estimation," Journal of the American Statistical Association, vol. 102, no. 477, pp. 359-378, 2007.
  18. [18] A. P. Dawid, "The geometry of proper scoring rules," Annals of the Institute of Statistical Mathematics, vol. 59, no. 1, pp. 77-93, 2007. https://doi.org/10.1007/s10463-006-0099-8
  19. [19] N. Merhav and M. Feder, "Universal prediction," IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2124-2147, 1998.
  20. [20] M. S. Alvim, K. Chatzikokolakis, C. Palamidessi, and G. Smith, "Measuring information leakage using generalized gain functions," in 2012 IEEE 25th Computer Security Foundations Symposium, 2012, pp. 265-279.
  21. [21] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, "Additive and multiplicative notions of leakage, and their capacities," in 2014 IEEE 27th Computer Security Foundations Symposium, 2014, pp. 308-322.
  22. [22] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, "Axioms for information leakage," in 2016 IEEE 29th Computer Security Foundations Symposium (CSF), 2016, pp. 77-92.
  23. [23] M. S. Alvim, K. Chatzikokolakis, A. McIver, C. Morgan, C. Palamidessi, and G. Smith, "An axiomatization of information flow measures," Theoretical Computer Science, vol. 777, pp. 32-54, 2019.
  24. [24] M. A. Zarrabian and P. Sadeghi, "An extension of the adversarial threat model in quantitative information flow," in 2025 IEEE 38th Computer Security Foundations Symposium (CSF), Jun. 2025, pp. 554-569. https://doi.ieeecomputersociety.org/10.1109/CSF64896.2025.00036
  25. [25] V. M. Tikhomirov, "On the notion of mean," in Selected Works of A. N. Kolmogorov, pp. 144-146, 1991.
  26. [26] A. Kamatsuka and T. Yoshida, "Several representations of α-mutual information and interpretations as privacy leakage measures," in 2025 IEEE International Symposium on Information Theory (ISIT), 2025, pp. 1-6.
  27. [27] A. Kamatsuka and T. Yoshida, "A generalized leakage interpretation of alpha-mutual information."
  28. [28] [Online]. Available: https://arxiv.org/abs/2601.09406
  29. [29] U. Augustin, "Noisy channels," Habilitation thesis, Universität Erlangen-Nürnberg, 1978.
  30. [30] I. Csiszár, "Generalized cutoff rates and Rényi's information measures," IEEE Transactions on Information Theory, vol. 41, no. 1, pp. 26-34, 1995.
  31. [31] A. Américo, M. Khouzani, and P. Malacaria, "Conditional entropy and data processing: An axiomatic approach based on core-concavity," IEEE Transactions on Information Theory, vol. 66, no. 9, pp. 5537-5547, 2020.
  32. [32] A. Américo and P. Malacaria, "Concavity, core-concavity, quasiconcavity: A generalizing framework for entropy measures," in 2021 IEEE 34th Computer Security Foundations Symposium (CSF), 2021, pp. 1-14.
  33. [33] T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, 2006.
  34. [34] G. A. Raggio, "Properties of q-entropies," Journal of Mathematical Physics, vol. 36, no. 9, pp. 4785-4791, 1995. https://doi.org/10.1063/1.530920
  35. [35] M. Ben-Bassat and J. Raviv, "Renyi's entropy and the probability of error," IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 324-331, 1978.
  36. [36] S. Hoffmann, "Concavity and additivity in diversity measurement: Rediscovery of an unknown concept," Working Paper Series, 2006.
  37. [37] A. Américo, M. Khouzani, and P. Malacaria, "Channel-supermodular entropies: Order theory and an application to query anonymization," Entropy, vol. 24, no. 1, 2022. https://www.mdpi.com/1099-4300/24/1/39
  38. [38] C. Connell and E. Rasmusen, "Concavifying the quasiconcave," Indiana University, Kelley School of Business, Department of Business Economics and Public Policy, Working Papers 2012-10, 2012. https://EconPapers.repec.org/RePEc:iuk:wpaper:2012-10