arxiv: 2605.08967 · v1 · submitted 2026-05-09 · ❄️ cond-mat.stat-mech · math-ph · math.MP · q-bio.QM

Condensation Transition in Entropy-Constrained Probability Spaces

Bautista Arenaza, Damián G. Hernández, Inés Samengo, Sebastián Risau-Gusman

Pith reviewed 2026-05-12 02:34 UTC · model grok-4.3

keywords condensation transition · probability simplex · Shannon entropy · phase transition · thermodynamic limit · statistical mechanics · sparsity

The pith

Below a critical entropy most probability distributions on the simplex enter a condensed state with one dominant component.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the distributions that populate fixed-entropy level surfaces inside the probability simplex. It introduces a discretization that gives equal weight to distinct microstate distributions, which allows a combinatorial count in the large-K limit. The central result is a condensation transition at a critical entropy that scales as H_c ≃ log K − 1 + γ, where γ is Euler's constant. Below this value the typical distribution has one component carrying a macroscopic fraction of the probability and the rest forming a uniform background. This supplies a purely entropic route to sparsity that could underlie overconfident machine-learning outputs and the appearance of dominant species.

Core claim

A condensation phase transition is shown to take place below a critical entropy that scales as H_c ≃ log K - 1 + γ in the thermodynamic limit. For entropy values H_0 < H_c, the overwhelming majority of distributions are found in a condensed state, in which a single component captures a macroscopic fraction of the total probability mass while the remaining components form a homogeneous fluid background.
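To give a sense of scale, the sketch below evaluates the quoted leading-order expression for a few values of K. It assumes natural logarithms (entropy in nats), which the page does not state explicitly; the function names are illustrative only.

```python
import math

# Euler's constant (gamma), hard-coded for the sketch
EULER_GAMMA = 0.5772156649015329

def max_entropy(K):
    """Entropy of the uniform distribution on K outcomes, in nats."""
    return math.log(K)

def critical_entropy(K):
    """Leading-order critical entropy H_c = log K - 1 + gamma (assumes nats)."""
    return math.log(K) - 1.0 + EULER_GAMMA

for K in (10, 1000, 10**6):
    gap = max_entropy(K) - critical_entropy(K)
    print(f"K={K:>8}  log K={max_entropy(K):.4f}  H_c={critical_entropy(K):.4f}  gap={gap:.4f}")
```

Under this formula the gap between the maximum entropy log K and H_c is the constant 1 − γ for every K, so the condensed regime begins a fixed distance below the maximum on this scale.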

What carries the argument

The discretization strategy that assigns equal statistical weight to distinct microstate distributions, which permits a combinatorial analysis of the simplex in the thermodynamic limit.
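The page does not spell out the discretization, so the following is only a toy sketch under an assumed scheme: probabilities are restricted to the grid π_i = n_i / N with non-negative integers n_i summing to N, and every grid point counts once. The names `compositions` and `entropy_histogram` are hypothetical, and the paper's actual construction may differ (for example, in whether permutations of the same vector are counted separately).

```python
import math
from itertools import combinations

def compositions(N, K):
    """Yield all K-tuples of non-negative integers summing to N (stars and bars)."""
    for bars in combinations(range(N + K - 1), K - 1):
        prev = -1
        parts = []
        for b in bars:
            parts.append(b - prev - 1)   # stars between consecutive bars
            prev = b
        parts.append(N + K - 2 - prev)   # stars after the last bar
        yield tuple(parts)

def shannon_entropy(p):
    """Shannon entropy in nats, with 0 log 0 taken as 0."""
    return -sum(v * math.log(v) for v in p if v > 0.0)

def entropy_histogram(N, K, n_bins=5):
    """Count equally weighted grid points per entropy bin on [0, log K]."""
    counts = [0] * n_bins
    h_max = math.log(K)
    for n in compositions(N, K):
        h = shannon_entropy([x / N for x in n])
        idx = min(int(n_bins * h / h_max), n_bins - 1)
        counts[idx] += 1
    return counts

counts = entropy_histogram(N=12, K=3)
print(counts, sum(counts))
```

Brute-force enumeration like this is only feasible for very small K and N; it is meant to make the equal-weight counting concrete, not to reach the thermodynamic limit.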

If this is right

For entropies below the critical value the overwhelming majority of distributions are condensed.
The same entropic constraint supplies a mechanism for overconfident predictions in machine learning.
Dominant species can emerge in ecological models solely from entropy limits on abundance distributions.
Sparsity arises naturally in high-dimensional manifolds when only entropy is constrained.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Entropy-constrained training objectives in machine learning could produce sparse representations without explicit L1 penalties.
Species-abundance histograms in ecology should be examined for condensation signatures once Shannon diversity falls below the critical scaling.
Monte Carlo sampling of finite-K simplices can be used to locate the sharpening of the transition and test the predicted offset by Euler's constant.
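As a starting point for such a Monte Carlo check, the sketch below samples the unconstrained uniform measure on the simplex (a flat Dirichlet, obtained by normalizing independent exponentials) and compares the mean entropy with log K − 1 + γ. It relies on a standard Dirichlet identity, not on anything stated in the paper: for the flat Dirichlet the expected entropy is exactly the harmonic number H_K minus 1, which approaches log K − 1 + γ for large K. It does not sample the paper's discretized fixed-entropy ensemble, and all function names are illustrative.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def sample_uniform_simplex(K, rng):
    """Uniform point on the simplex: normalize K iid Exp(1) variables."""
    x = [rng.expovariate(1.0) for _ in range(K)]
    s = sum(x)
    return [v / s for v in x]

def shannon_entropy(p):
    """Shannon entropy in nats, with 0 log 0 taken as 0."""
    return -sum(v * math.log(v) for v in p if v > 0.0)

def mean_entropy(K, n_samples, seed=0):
    """Monte Carlo estimate of the mean entropy under the uniform measure."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += shannon_entropy(sample_uniform_simplex(K, rng))
    return total / n_samples

def harmonic(K):
    """K-th harmonic number."""
    return sum(1.0 / k for k in range(1, K + 1))

for K in (50, 200):
    est = mean_entropy(K, n_samples=300)
    exact = harmonic(K) - 1.0
    asymptotic = math.log(K) - 1.0 + EULER_GAMMA
    print(f"K={K}: sampled={est:.4f}  exact mean={exact:.4f}  log K - 1 + gamma={asymptotic:.4f}")
```

Testing the transition itself would additionally require conditioning on a fixed entropy H_0, which this sketch does not attempt.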

Load-bearing premise

The discretization strategy that assigns equal statistical weight to distinct microstate distributions enables a combinatorial analysis of the simplex in the thermodynamic limit.

What would settle it

Generate or sample many discrete distributions on the K-simplex at fixed entropy H_0 for successively larger K, then measure the fraction in which the largest component exceeds a macroscopic threshold such as 1/sqrt(K); the jump to near-unity should occur near the predicted H_c scaling.
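The measurement step of this test can be sketched independently of how the fixed-entropy samples are produced. The code below assumes the samples are supplied as lists of probabilities and uses the 1/sqrt(K) threshold mentioned above; `spike_plus_background` is a hypothetical helper that builds an idealized condensed vector (one heavy component, perfectly uniform remainder) purely for illustration.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in nats, with 0 log 0 taken as 0."""
    return -sum(v * math.log(v) for v in p if v > 0.0)

def is_condensed(p, threshold=None):
    """Flag a distribution whose largest component exceeds the threshold."""
    if threshold is None:
        threshold = 1.0 / math.sqrt(len(p))  # default threshold from the text
    return max(p) > threshold

def condensed_fraction(samples, threshold=None):
    """Fraction of sampled distributions flagged as condensed."""
    flags = [is_condensed(p, threshold) for p in samples]
    return sum(flags) / len(flags)

def spike_plus_background(K, m):
    """Idealized condensed state: mass m on one component, the rest uniform."""
    return [m] + [(1.0 - m) / (K - 1)] * (K - 1)

K = 100
uniform = [1.0 / K] * K
spiked = spike_plus_background(K, 0.5)
print(shannon_entropy(uniform), shannon_entropy(spiked))
print(condensed_fraction([uniform, spiked]))
```

In an actual test, `samples` would be replaced by draws from the fixed-entropy ensemble at each K, and the fraction would be tracked as H_0 crosses the predicted H_c.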

Figures

Figures reproduced from arXiv: 2605.08967 by Bautista Arenaza, Damián G. Hernández, Inés Samengo, Sebastián Risau-Gusman.

**Figure 2.** Marginal probability densities

**Figure 3.** Evidence of condensation for sub-manifolds of con…

Original abstract

The organization of high-dimensional probability spaces is a fundamental problem at the intersection of statistical physics and information theory. Here, we analyze the distributions populating level surfaces of the probability simplex $\Delta_{K-1}$ defined by a fixed Shannon entropy. We introduce a discretization strategy that assigns equal statistical weight to distinct microstate distributions and enables a combinatorial analysis of the simplex. A condensation phase transition is shown to take place below a critical entropy that scales as $H_c \simeq \log K - 1 + \gamma$ in the thermodynamic limit. For entropy values $H_0 < H_c$, the overwhelming majority of distributions are found in a condensed state, in which a single component captures a macroscopic fraction of the total probability mass while the remaining components form a homogeneous fluid background. These results provide a framework for understanding phenomena such as overconfident predictions in machine learning and the emergence of dominant species in ecology, and suggest that sparsity can arise naturally from entropic constraints in high-dimensional manifolds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper uses discretization to count fixed-entropy distributions on the simplex and reports a condensation transition at H_c ≃ log K − 1 + γ, but the counting measure may not match the continuous geometry.

The main point is that they discretize the K-simplex, give equal weight to each microstate distribution, and then show combinatorially that below a critical entropy the typical distribution condenses: one component grabs a macroscopic probability share while the rest stay roughly uniform. The scaling H_c ≃ log K − 1 + γ is the concrete new claim, and it is not the usual textbook picture of the simplex.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes distributions on the probability simplex Δ_{K-1} constrained to fixed Shannon entropy H_0. It introduces a discretization that assigns equal statistical weight to distinct microstate distributions, enabling a combinatorial enumeration in the thermodynamic limit K→∞. The central claim is a condensation phase transition at critical entropy H_c ≃ log K − 1 + γ; for H_0 < H_c the overwhelming majority of distributions are condensed, with one component carrying a macroscopic probability mass and the remainder a homogeneous fluid background. Applications to machine-learning overconfidence and ecological dominance are suggested.

Significance. If the combinatorial counting is shown to be asymptotically equivalent to the natural geometric measure on the entropy level set, the result supplies a statistical-mechanics explanation for the spontaneous emergence of sparsity under pure entropic constraints in high-dimensional manifolds. The explicit scaling H_c ≃ log K − 1 + γ and the identification of the condensed phase are potentially useful for both theoretical and applied work in information theory and statistical physics.

major comments (2)

[§3] §3 (Discretization and measure): The combinatorial analysis assigns uniform weight to each distinct discretized microstate. The manuscript must explicitly demonstrate that this counting measure converges to the (K−2)-dimensional Hausdorff measure on the entropy-constrained surface as the discretization scale →0 and K→∞; otherwise the statement that condensed configurations constitute the “overwhelming majority” does not necessarily transfer from the discrete to the continuous setting.
[§4.2] §4.2 (Thermodynamic-limit derivation): The scaling H_c ≃ log K − 1 + γ is obtained from a saddle-point or large-deviation analysis of the number of microstates. The derivation should state the precise asymptotic regime (discretization bin size relative to 1/K) under which the entropy of the counting measure remains sub-extensive; without this control the transition location may shift or disappear.
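The concern about matching a counting measure to a continuous one can be illustrated on the full (unconstrained) simplex, where both quantities are known exactly. The sketch assumes a grid π_i = n_i / N, for which the number of grid points is the binomial coefficient C(N+K−1, K−1) and the corresponding continuum volume estimate is N^(K−1)/(K−1)!. It says nothing about the entropy level set itself, and the function names are illustrative.

```python
import math
from fractions import Fraction

def grid_points(N, K):
    """Number of K-tuples of non-negative integers summing to N."""
    return math.comb(N + K - 1, K - 1)

def continuum_estimate(N, K):
    """Volume of the simplex scaled by N, in units of grid cells."""
    return Fraction(N ** (K - 1), math.factorial(K - 1))

def count_to_volume_ratio(N, K):
    """Ratio of the discrete count to its continuum estimate."""
    return float(Fraction(grid_points(N, K)) / continuum_estimate(N, K))

# Fine grid at fixed K: the ratio approaches 1.
print(count_to_volume_ratio(10**6, 3))
# Grid resolution comparable to K: the ratio is far from 1.
print(count_to_volume_ratio(50, 50))
```

The second case shows why the relative scaling of the grid resolution and K matters: when N does not grow much faster than K, boundary grid points dominate and the count no longer tracks the volume.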

minor comments (2)

[§2] The notation for the discretized simplex and the binning procedure should be introduced with an explicit figure or equation showing how the continuous simplex is partitioned.
[Introduction] A brief comparison with the uniform (Lebesgue) measure on the simplex, or with the Dirichlet distribution at fixed entropy, would clarify the novelty of the chosen counting measure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which have helped us clarify key aspects of the discretization and asymptotic analysis. We address each major comment below and have revised the manuscript to incorporate additional justification and explicit statements of the relevant regimes.

Referee: [§3] §3 (Discretization and measure): The combinatorial analysis assigns uniform weight to each distinct discretized microstate. The manuscript must explicitly demonstrate that this counting measure converges to the (K−2)-dimensional Hausdorff measure on the entropy-constrained surface as the discretization scale →0 and K→∞; otherwise the statement that condensed configurations constitute the “overwhelming majority” does not necessarily transfer from the discrete to the continuous setting.

Authors: We agree that a clear link between the discrete counting measure and the continuous Hausdorff measure strengthens the interpretation. In the revised manuscript we have added a dedicated paragraph in §3 explaining the convergence. When the bin size δ satisfies δ→0 with Kδ→∞, each small volume element on the simplex contains a number of microstates proportional to its (K−2)-dimensional volume; the entropy constraint then selects a hypersurface whose measure is faithfully represented by the discrete count in the large-K limit. Standard results from geometric measure theory and uniform sampling on the simplex support the conclusion that the “overwhelming majority” statement carries over. We view this as sufficient for the present work while noting that a fully rigorous ε-δ proof lies outside the paper’s scope.
revision: yes

Referee: [§4.2] §4.2 (Thermodynamic-limit derivation): The scaling H_c ≃ log K − 1 + γ is obtained from a saddle-point or large-deviation analysis of the number of microstates. The derivation should state the precise asymptotic regime (discretization bin size relative to 1/K) under which the entropy of the counting measure remains sub-extensive; without this control the transition location may shift or disappear.

Authors: We thank the referee for emphasizing the need to control the discretization scale. The revised §4.2 now explicitly states the regime: we take the number of bins per coordinate M=1/δ to satisfy log M = o(K), or equivalently δ ≫ exp(−cK) for any c>0 but still δ→0. Under this condition the combinatorial entropy contributed by the counting measure is K log M = o(K) and therefore sub-extensive relative to the leading entropy terms of order K. The saddle-point equation that locates H_c is consequently unaffected at leading order, preserving the scaling H_c ≃ log K − 1 + γ. A short paragraph has been inserted to document this control and to confirm that the condensation transition persists.
revision: yes

Circularity Check

0 steps flagged

No circularity: combinatorial derivation from explicit discretization choice

The paper introduces a discretization strategy that assigns equal weight to microstate distributions as an explicit modeling assumption, then performs combinatorial enumeration on the resulting discrete simplex to locate the condensation transition and extract the scaling H_c ≃ log K − 1 + γ. No step reduces a claimed prediction to a fitted parameter by construction, no load-bearing self-citation chain is invoked, and the central result is obtained by direct counting under the chosen measure rather than by re-labeling an input. The derivation is therefore self-contained against the paper’s own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review supplies insufficient technical detail to enumerate free parameters, axioms, or invented entities with precision. The thermodynamic limit and the equal-weight discretization are the two load-bearing premises visible in the text.

axioms (2)

domain assumption: Discretization strategy assigns equal statistical weight to distinct microstate distributions.
This premise is introduced to convert the continuous simplex into a combinatorially tractable set.
standard math: Thermodynamic limit K → ∞ is taken to extract the scaling H_c ≃ log K − 1 + γ.
The critical entropy scaling is stated only in this limit.

pith-pipeline@v0.9.0 · 5497 in / 1425 out tokens · 68490 ms · 2026-05-12T02:34:46.357739+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
We introduce a discretization strategy that assigns equal statistical weight to distinct microstate distributions and enables a combinatorial analysis of the simplex... p(n*)∝K! / ∏ n^{m_n} n! ... maximizing the entropy of the multiplicity distribution H(f)
IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean alpha_pin_under_high_calibration unclear
H_c ≃ log K −1 +γ ... condensation when β<0 ... single component captures macroscopic fraction
