pith. machine review for the scientific record.

arxiv: 2605.11546 · v1 · submitted 2026-05-12 · 💻 cs.IT · math.IT

Recognition: no theorem link

The Entropy of Floating-Point Numbers

Anant Sahai, Michael R. DeWeese, Samuel H. D'Ambrosia, Sultan Daniels


Pith reviewed 2026-05-13 01:23 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords: entropy · floating-point quantization · analytic approximation · scaling invariance · discrete entropy · error bounds

The pith

An analytic approximation for floating-point entropy reveals a scaling-invariant link to the underlying continuous distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an analytic approximation for the entropy of a random variable after it has been quantized to floating-point representation, together with explicit bounds on the approximation error. Unlike uniform quantization, where discrete entropy connects directly to differential entropy plus a constant, floating-point quantization connects through a different quantity that the approximation identifies. The work further proves that this floating-point entropy changes by only a negligible amount when the original continuous random variable is multiplied by any positive scaling factor. Closed-form expressions for the approximation are supplied for several standard distributions and checked against exact numerical computations.
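A quick way to see the uniform-quantization baseline the paper contrasts against: as the bin width ∆ shrinks, the discrete entropy of a uniformly quantized variable approaches h(X) + log2(1/∆), where h is the differential entropy. A minimal numerical sketch for a standard normal (not from the paper; the summation range and bin convention are assumptions):

```python
import math

def uniform_quantized_entropy(delta, sigma=1.0):
    """Discrete entropy (bits) of X ~ N(0, sigma^2) quantized to bins
    [k*delta, (k+1)*delta), summed over bins covering +/- 10 sigma."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    k_max = int(10.0 * sigma / delta) + 1
    H = 0.0
    for k in range(-k_max, k_max):
        p = cdf((k + 1) * delta) - cdf(k * delta)
        if p > 0.0:
            H -= p * math.log2(p)
    return H

delta = 1.0 / 64
h = 0.5 * math.log2(2.0 * math.pi * math.e)  # differential entropy of N(0, 1)
print(uniform_quantized_entropy(delta))      # ~8.047 bits
print(h + math.log2(1.0 / delta))            # 2.047 + 6 = ~8.047 bits
```

Floating-point quantization breaks this picture because the bin width varies with magnitude, which is the gap the paper's linking quantity is built to fill.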

Core claim

We present an analytic approximation for the entropy of floating-point numbers along with bounds on the error of this approximation. It is well-known that the differential entropy is tightly linked to the discrete entropy of a uniformly quantized random variable. Our approximation uncovers a different quantity that provides this link for floating-point quantization. Additionally, we prove that the entropy of a floating-point quantized random variable is approximately unchanged under scaling. Closed-form expressions for the floating-point entropy of common distributions are provided and compared to exact results.

What carries the argument

The analytic approximation formula that expresses floating-point entropy in terms of properties of the continuous distribution, together with the scaling-invariance proof that follows from it.

If this is right

  • Closed-form expressions become available for the entropy of common distributions under floating-point quantization.
  • The approximation error can be bounded explicitly once the distribution parameters and floating-point format are known.
  • Floating-point entropy remains nearly constant when the underlying variable is rescaled, so magnitude changes do not require re-deriving the entropy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Quantization-aware algorithms could treat scale changes as information-neutral operations within the same floating-point format.
  • The scaling result may simplify analysis of information loss when data passes through floating-point stages at different magnitudes.

Load-bearing premise

The approximation and its error bounds hold only when the continuous distribution and the floating-point format parameters satisfy certain unspecified conditions.

What would settle it

For a fixed distribution such as a standard normal, compute the exact floating-point entropy at two widely separated scales and check whether the difference stays inside the stated error bounds while the approximation itself remains accurate.
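A sketch of that check, consistent with the midpoint-quantization setup in Figure 2 below (K = 2^(E+p) representable values), under assumed conventions: a toy signed format with an E-bit biased exponent and a (p − 1)-bit significand, no subnormals or special values, and clipping at the extremes. The paper's exact format may differ in these details.

```python
import math

def representable_values(p=4, E=5):
    """All K = 2**(E + p) values of a toy signed float: (p-1)-bit
    significand with implicit leading 1, E-bit biased exponent,
    no subnormals or special values (an assumption)."""
    bias = 2 ** (E - 1) - 1
    pos = sorted((1.0 + k / 2 ** (p - 1)) * 2.0 ** (e - bias)
                 for e in range(2 ** E) for k in range(2 ** (p - 1)))
    return [-v for v in reversed(pos)] + pos

def fp_entropy(sigma, p=4, E=5):
    """Exact discrete entropy H(Xfp) in bits for X ~ N(0, sigma^2) under
    midpoint quantization with clipping: bin edges are midpoints of
    adjacent representable values; the outer bins run to +/- infinity."""
    u = representable_values(p, E)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    edges = [-math.inf] + [(a + b) / 2 for a, b in zip(u, u[1:])] + [math.inf]
    probs = (cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:]))
    return -sum(q * math.log2(q) for q in probs if q > 0.0)

# Two scales separated by a factor of 256, both well inside the
# representable range; approximate scale invariance predicts the two
# entropies agree up to a small error.
print(fp_entropy(1.0), fp_entropy(256.0))
```

If the two printed values differ by more than the paper's stated error bounds, the scaling result fails this instance; if they agree, the remaining question is whether the analytic approximation tracks these exact values equally well at both scales.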

Figures

Figures reproduced from arXiv: 2605.11546 by Anant Sahai, Michael R. DeWeese, Samuel H. D'Ambrosia, Sultan Daniels.

Figure 1
Figure 1: Floating-point structure, and comparisons between analytic approximation and numerically evaluated exact discrete entropy. (1a) The structure of a floating-point number, where each box represents one bit. The true bin size function ∆(x) is plotted on log-log scale for both midpoint (black solid curve) and floor quantization (blue dotted curve) with p = 10 and E = 4, along with the smooth approximation ∆s(x)… view at source ↗
Figure 2
Figure 2: Clipping and midpoint quantization with K = 3 representable values {u1, u2, u3}. The blue vertical lines represent the midpoints, and the arrows depict the regions of the real line that map to each representable value (black vertical lines). For a representation of X with an E-bit exponent and a (p − 1)-bit significand, the discrete entropy of Xfp is given by Theorem 4 with K = 2^(E+p)… view at source ↗
Figure 3
Figure 3: Exact midpoint-quantized entropy vs. standard deviation σ. For each precision p ∈ {1, …, 8}, the exact discrete entropy H(Xfp) of X ∼ N(0, σ²) (with µ = 0) is plotted as a function of σ over a wide log-scale range. Each curve corresponds to a distinct value of exponent bits E ∈ {0, 1, …, 7}. The vertical dashed lines mark σ = 2^emin and the vertical dotted lines mark σ = 2^emax for each E, and the… view at source ↗
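To make Figure 1's bin size concrete: for a binary format with a (p − 1)-bit significand, the spacing between representable values near x is the staircase ∆(x) = 2^(⌊log2|x|⌋ − (p − 1)), and any smooth approximation must grow in proportion to |x|. A minimal sketch assuming ∆s(x) = |x|·2^(−(p−1)); the paper's exact ∆s may carry a different constant factor.

```python
import math

def delta_true(x, p=10):
    """Staircase bin size Delta(x): ulp spacing of a binary format with a
    (p-1)-bit significand near x, ignoring exponent range limits."""
    return 2.0 ** (math.floor(math.log2(abs(x))) - (p - 1))

def delta_smooth(x, p=10):
    """Smooth proxy Delta_s(x) proportional to |x| (an assumed form; the
    paper's smooth approximation may differ by a constant factor)."""
    return abs(x) * 2.0 ** -(p - 1)

for x in (0.7, 1.0, 3.5, 1000.0):
    print(f"x={x:g}  Delta={delta_true(x):.3g}  Delta_s={delta_smooth(x):.3g}")
```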

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper presents an analytic approximation, with error bounds, for the Shannon entropy of a continuous random variable after quantization to a floating-point representation. It identifies a distinct linking quantity (different from differential entropy) between this quantized entropy and the underlying continuous distribution. The authors prove that the resulting entropy is approximately invariant under scaling of the random variable and supply closed-form expressions for several common distributions (e.g., Gaussian), which are compared against exact numerical computations.

Significance. If the approximation and bounds hold under the stated conditions, the work supplies a practical bridge between differential and discrete entropy tailored to floating-point formats that dominate numerical computing. The approximate scale invariance is a notable distinction from uniform quantization results and could simplify entropy analysis in scaled or normalized data settings. The closed-form expressions together with direct comparisons to exact values add concrete utility and verifiability.

minor comments (3)
  1. [Abstract] The phrasing 'entropy of floating-point numbers' is slightly loose; it would be clearer to state 'entropy of a random variable after floating-point quantization' to avoid any implication that the entropy is a property of the format itself rather than of the quantized variable.
  2. [Numerical validation] The manuscript should explicitly list the floating-point parameters (mantissa bits, exponent bias/range) used for the closed-form comparisons in the numerical validation section so that readers can reproduce the exact results.
  3. [Main derivation] Notation for the linking quantity introduced for floating-point quantization should be defined once in the main text and used consistently thereafter to prevent any ambiguity with standard differential entropy.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation of minor revision. The referee's description accurately reflects the paper's contributions regarding the analytic approximation, error bounds, linking quantity, approximate scale invariance, and closed-form expressions for common distributions.

Circularity Check

0 steps flagged

No significant circularity; analytic derivation with external validation

full rationale

The paper presents an analytic approximation for floating-point entropy along with explicit error bounds, identifies a distinct linking quantity to differential entropy specific to FP quantization, proves approximate scale invariance of the quantized entropy, and supplies closed-form expressions for common distributions that are directly compared to exact results. These elements indicate a self-contained derivation chain that stands independently of its inputs and can be falsified or confirmed against external exact computations and stated conditions on the density and FP parameters. No load-bearing step reduces by construction to a fit, self-definition, or self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate free parameters, axioms, or invented entities; no specific equations or assumptions are stated.

pith-pipeline@v0.9.0 · 5377 in / 1004 out tokens · 40928 ms · 2026-05-13T01:23:24.668655+00:00 · methodology


Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

  1. [1]

    Handbook of Floating-Point Arithmetic,

    J.-M. Muller et al., Handbook of Floating-Point Arithmetic. Boston: Birkhäuser, 2010.

  2. [2]

    What every computer scientist should know about floating-point arithmetic,

    D. Goldberg, “What every computer scientist should know about floating-point arithmetic,” ACM Comput. Surv., vol. 23, no. 1, pp. 5–48, Mar. 1991. [Online]. Available: https://doi.org/10.1145/103162.103163

  3. [3]

    NVIDIA Blackwell Architecture Technical Overview,

    NVIDIA, “NVIDIA Blackwell Architecture Technical Overview,” NVIDIA, Tech. Rep., 2025. [Online]. Available: https://resources.nvidia.com/en-us-blackwell-architecture

  4. [4]

    AMD CDNA 4 Architecture,

    Advanced Micro Devices, Inc., “AMD CDNA 4 Architecture,” AMD, Tech. Rep., Oct. 2025. [Online]. Available: https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-4-architecture-whitepaper.pdf

  5. [5]

    FP8 Quantization: The Power of the Exponent,

    A. Kuzmin, M. van Baalen, Y. Ren, M. Nagel, J. Peters, and T. Blankevoort, “FP8 Quantization: The Power of the Exponent,” Advances in Neural Information Processing Systems, vol. 35, pp. 14651–14662, Dec. 2022. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/hash/5e07476b6bd2497e1fbd11b8f0b2de3c-Abstract-Conference.html

  6. [6]

    Microscaling Data Formats for Deep Learning, arXiv preprint arXiv:2310.10537,

    B. D. Rouhani, R. Zhao, A. More, M. Hall, A. Khodamoradi, S. Deng, D. Choudhary, M. Cornea, E. Dellinger, K. Denolf, S. Dusan, V. Elango, M. Golub, A. Heinecke, P. James-Roxby, D. Jani, G. Kolhe, M. Langhammer, A. Li, L. Melnick, M. Mesmakhosroshahi, A. Rodriguez, M. Schulte, R. Shafipour, L. Shao, M. Siu, P. Dubey, P. Micikevicius, M. Naumov, C. Verrill...

  7. [7]

    Quantization,

    R. M. Gray and D. L. Neuhoff, “Quantization,” IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2325–2383, 1998.

  8. [8]

    Asymptotically efficient quantizing,

    H. Gish and J. Pierce, “Asymptotically efficient quantizing,” IEEE Transactions on Information Theory, vol. 14, no. 5, pp. 676–683, Sep. 1968. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/1054193


  10. [10]

    Data Compression With Low Distortion and Finite Blocklength,

    V. Kostina, “Data Compression With Low Distortion and Finite Blocklength,” IEEE Transactions on Information Theory, vol. 63, no. 7, pp. 4268–4285, Jul. 2017. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7867787

  11. [11]

    T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). USA: Wiley-Interscience, 2006.
