The Entropy of Floating-Point Numbers
Pith reviewed 2026-05-13 01:23 UTC · model grok-4.3
The pith
An analytic approximation for floating-point entropy reveals a link to the underlying continuous distribution and shows that the quantized entropy is approximately invariant under scaling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present an analytic approximation for the entropy of floating-point numbers along with bounds on the error of this approximation. It is well-known that the differential entropy is tightly linked to the discrete entropy of a uniformly quantized random variable. Our approximation uncovers a different quantity that provides this link for floating-point quantization. Additionally, we prove that the entropy of a floating-point quantized random variable is approximately unchanged under scaling. Closed-form expressions for the floating-point entropy of common distributions are provided and compared to exact results.
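The classical link the claim refers to is H(X_Δ) ≈ h(X) − log₂ Δ for uniform quantization with bin width Δ. It can be checked numerically; the sketch below is illustrative and not from the paper (a standard normal, a bin range of ±10, and Δ = 2⁻⁶ are all assumed choices):

```python
import math

def normal_cdf(x: float) -> float:
    """CDF of the standard normal via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def uniform_quantized_entropy(delta: float, lo: float = -10.0, hi: float = 10.0) -> float:
    """Discrete entropy (bits) of a standard normal quantized to bins of width delta.

    Tail mass beyond [lo, hi] is negligible for a standard normal.
    """
    n_bins = int(round((hi - lo) / delta))
    h, prev = 0.0, normal_cdf(lo)
    for i in range(1, n_bins + 1):
        cur = normal_cdf(lo + i * delta)
        p = cur - prev
        prev = cur
        if p > 0.0:
            h -= p * math.log2(p)
    return h

delta = 2.0 ** -6
h_diff = 0.5 * math.log2(2.0 * math.pi * math.e)   # differential entropy of N(0,1)
approx = h_diff - math.log2(delta)                  # classical uniform-quantization link
exact = uniform_quantized_entropy(delta)
print(f"exact {exact:.4f} bits, approx {approx:.4f} bits")
```

At this bin width the two values agree closely, which is exactly the uniform-quantization behavior against which the paper's floating-point analogue is positioned.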
What carries the argument
The analytic approximation formula that expresses floating-point entropy in terms of properties of the continuous distribution, together with the scaling-invariance proof that follows from it.
If this is right
- Closed-form expressions become available for the entropy of common distributions under floating-point quantization.
- The approximation error can be bounded explicitly once the distribution parameters and floating-point format are known.
- Floating-point entropy remains nearly constant when the underlying variable is rescaled, so magnitude changes do not require re-deriving the entropy.
Where Pith is reading between the lines
- Quantization-aware algorithms could treat scale changes as information-neutral operations within the same floating-point format.
- The scaling result may simplify analysis of information loss when data passes through floating-point stages at different magnitudes.
Load-bearing premise
The approximation and its error bounds hold only when the continuous distribution and the floating-point format parameters satisfy certain unspecified conditions.
What would settle it
For a fixed distribution such as a standard normal, compute the exact floating-point entropy at two widely separated scales and check whether the difference stays inside the stated error bounds while the approximation itself remains accurate.
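That experiment can be sketched directly. Everything below is illustrative rather than taken from the paper: a hypothetical toy float with 3 mantissa bits, exponents in [−12, 12], no zero or subnormals, round-to-nearest modeled by midpoint bins, and a zero-mean normal at scales 1 and 100:

```python
import math

def normal_cdf(x: float, sigma: float) -> float:
    """CDF of N(0, sigma^2)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def toy_fp_values(p_bits: int = 3, e_min: int = -12, e_max: int = 12) -> list:
    """Positive representable values (1 + m/2^p) * 2^e of a toy float format."""
    return sorted((1.0 + m / 2.0 ** p_bits) * 2.0 ** e
                  for e in range(e_min, e_max + 1)
                  for m in range(2 ** p_bits))

def fp_entropy(sigma: float) -> float:
    """Entropy (bits) of N(0, sigma^2) rounded to the nearest toy-float value.

    By symmetry, each positive bin has a mirror-image negative bin with the
    same probability, so the sum over positive bins is doubled.
    """
    vals = toy_fp_values()
    # Round-to-nearest bins: edges at midpoints between consecutive magnitudes.
    edges = [0.0] + [(a + b) / 2.0 for a, b in zip(vals, vals[1:])] + [math.inf]
    h = 0.0
    for lo, hi in zip(edges, edges[1:]):
        p = normal_cdf(hi, sigma) - normal_cdf(lo, sigma)
        if p > 0.0:
            h -= 2.0 * p * math.log2(p)  # positive bin and its negative mirror
    return h

h1, h2 = fp_entropy(1.0), fp_entropy(100.0)
print(f"sigma=1: {h1:.4f} bits, sigma=100: {h2:.4f} bits, diff {abs(h1 - h2):.4f}")
```

If the paper's approximate scale invariance holds and both scales sit well inside the exponent range, the two entropies should differ by far less than a bit; whether that difference also lands inside the paper's stated error bounds is the question the test above is meant to settle.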
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents an analytic approximation, with error bounds, for the Shannon entropy of a continuous random variable after quantization to a floating-point representation. It identifies a distinct linking quantity (different from differential entropy) between this quantized entropy and the underlying continuous distribution. The authors prove that the resulting entropy is approximately invariant under scaling of the random variable and supply closed-form expressions for several common distributions (e.g., Gaussian), which are compared against exact numerical computations.
Significance. If the approximation and bounds hold under the stated conditions, the work supplies a practical bridge between differential and discrete entropy tailored to floating-point formats that dominate numerical computing. The approximate scale invariance is a notable distinction from uniform quantization results and could simplify entropy analysis in scaled or normalized data settings. The closed-form expressions together with direct comparisons to exact values add concrete utility and verifiability.
Minor comments (3)
- [Abstract] The phrasing 'entropy of floating-point numbers' is slightly loose; it would be clearer to state 'entropy of a random variable after floating-point quantization' to avoid any implication that the entropy is a property of the format itself rather than of the quantized variable.
- [Numerical validation] The manuscript should explicitly list the floating-point parameters (mantissa bits, exponent bias/range) used for the closed-form comparisons in the numerical validation section so that readers can reproduce the exact results.
- [Main derivation] Notation for the linking quantity introduced for floating-point quantization should be defined once in the main text and used consistently thereafter to prevent any ambiguity with standard differential entropy.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our work and the recommendation of minor revision. The referee's description accurately reflects the paper's contributions regarding the analytic approximation, error bounds, linking quantity, approximate scale invariance, and closed-form expressions for common distributions.
Circularity Check
No significant circularity; analytic derivation with external validation
full rationale
The paper presents an analytic approximation for floating-point entropy along with explicit error bounds, identifies a distinct linking quantity to differential entropy specific to FP quantization, proves approximate scale invariance of the quantized entropy, and supplies closed-form expressions for common distributions that are directly compared to exact results. These elements indicate a self-contained derivation chain that stands independently of its inputs and can be falsified or confirmed against external exact computations and stated conditions on the density and FP parameters. No load-bearing step reduces by construction to a fit, self-definition, or self-citation chain.