High-Rate Quantized Matrix Multiplication II
Pith reviewed 2026-05-14 19:26 UTC · model grok-4.3
The pith
WaterSIC uses waterfilling on the input covariance to make scalar quantization of LLM weights basis-independent and within 0.25 bit per entry of the information-theoretic limit.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When the covariance matrix Σ_X is known, waterfilling applied to its eigenvalues produces a quantization scheme whose high-rate distortion depends only on det(Σ_X) and lies within a multiplicative factor of 2πe/12 of the information-theoretic minimum distortion for the weighted mean squared error problem.
What carries the argument
Waterfilling rate allocation on the eigenvalues of Σ_X to set per-coordinate step sizes for scalar integer quantizers.
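As a concrete sketch of this allocation, here is a minimal reverse-waterfilling routine on a known eigenvalue vector. The function name, bisection scheme, and example numbers are illustrative, not taken from the paper:

```python
import numpy as np

def reverse_waterfill(eigvals, total_rate_bits):
    """Choose a water level theta so that the active-coordinate rates
    sum to the budget:  sum_i max(0, 0.5*log2(lam_i/theta)) == total_rate_bits.
    Per-coordinate distortion is then D_i = min(lam_i, theta)."""
    lo, hi = 0.0, float(max(eigvals))
    for _ in range(100):  # bisection: rate decreases as theta rises
        theta = 0.5 * (lo + hi)
        rate = sum(max(0.0, 0.5 * np.log2(l / theta)) for l in eigvals)
        if rate > total_rate_bits:
            lo = theta  # level too low -> too much rate; raise it
        else:
            hi = theta
    rates = np.array([max(0.0, 0.5 * np.log2(l / theta)) for l in eigvals])
    dists = np.minimum(eigvals, theta)
    return theta, rates, dists

# Example: 4 eigenvalues, budget of 8 bits total across coordinates.
# The smallest eigenvalue (0.01) falls below the water level and gets 0 bits.
theta, rates, dists = reverse_waterfill(np.array([4.0, 1.0, 0.25, 0.01]), 8.0)
```

The resulting per-coordinate distortions D_i = min(λ_i, θ) translate to scalar quantizer step sizes via the standard high-rate rule Δ_i = sqrt(12 D_i).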
Load-bearing premise
The high-rate regime approximation accurately describes the distortion achieved at the bit widths used in practical LLM quantization, with the covariance matrix known and stationary.
What would settle it
Direct computation of quantization MSE on Llama-3-8B layer weights at 4 bits per entry, compared against the value predicted from det(Σ_X) scaled by the 2πe/12 factor.
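A toy version of that settling experiment, with a synthetic covariance standing in for an actual Llama-3-8B layer's Σ_X (the dimension, rate, and equal-distortion step choice are our assumptions). It checks only the granular Δ²/12 noise model; the full test would also estimate the coded rate:

```python
import numpy as np

rng = np.random.default_rng(0)
n, R = 8, 4.0                                  # dimension and rate (bits/entry), stand-ins
A = rng.standard_normal((n, n))
Sigma = A @ A.T / n + 0.5 * np.eye(n)          # synthetic, well-conditioned Sigma_X

# High-rate Gaussian benchmark: D* = det(Sigma)^(1/n) * 2^(-2R);
# the reviewed claim places WaterSIC at (2*pi*e/12) * D*.
D_star = np.exp(np.linalg.slogdet(Sigma)[1] / n) * 2.0 ** (-2 * R)
D_pred = (2 * np.pi * np.e / 12) * D_star

# Empirical MSE of an unbounded uniform quantizer in the eigenbasis,
# with a common step chosen so that Delta^2/12 = D_pred.
lam, U = np.linalg.eigh(Sigma)
X = rng.multivariate_normal(np.zeros(n), Sigma, size=200_000)
Y = X @ U                                      # decorrelate
Delta = np.sqrt(12 * D_pred)
mse = np.mean((Y - Delta * np.round(Y / Delta)) ** 2)
ratio = mse / D_pred                           # ~1 if the white-noise model holds
```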
Original abstract
This is the second part of the work investigating quantized matrix multiplication (MatMul). In part I we considered the case of calibration-free quantization, whereas here we discuss the setting where covariance matrix $\Sigma_X$ of the columns of the second factor is available. This setting arises in the ubiquitous task of weight-only post-training quantization of LLMs. Weight-only quantization is related to the problem of weighted mean squared error (WMSE) source coding, whose classical (reverse) waterfilling solution dictates how one should distribute rate between coordinates of the vector. We show how waterfilling can be used to improve practical LLM quantization algorithms (GPTQ), which at present allocate rate equally. A recent scheme (known as "WaterSIC") that only uses scalar INT quantizers is analyzed and its high-rate performance is shown to be (a) basis free (i.e., characterized by the determinant of $\Sigma_X$ and, thus, unlike existing schemes, is immune to applying random rotations); and (b) within a multiplicative factor of $\frac{2\pi e}{12}$ (or 0.25 bit/entry) of the information-theoretic distortion limit. GPTQ's performance, in turn, is affected by the choice of basis, but for a random rotation and actual $\Sigma_X$ from Llama-3-8B we find it to be within 0.1 bit (depending on the layer type) of WaterSIC, suggesting that GPTQ with random rotation is also near optimal, at least in the high-rate regime.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is the second part of a study on quantized matrix multiplication, focusing on the setting where the covariance matrix Σ_X of the input columns is known (as arises in weight-only post-training quantization of LLMs). It shows how classical reverse waterfilling can be applied to improve rate allocation in schemes such as GPTQ, and analyzes the WaterSIC scheme (scalar INT quantizers) whose high-rate performance is claimed to be (a) fully characterized by det(Σ_X) and therefore basis-independent, and (b) within a multiplicative factor of 2πe/12 (≈0.25 bit per entry) of the Gaussian rate-distortion bound. Experiments on Llama-3-8B layers indicate that randomly rotated GPTQ lies within 0.1 bit of WaterSIC, suggesting near-optimality in the high-rate regime.
Significance. If the high-rate analysis and finite-rate experiments hold, the work supplies a clean theoretical link between classical reverse-waterfilling source coding and practical LLM quantization. The basis-free characterization of WaterSIC and the explicit 0.25-bit gap to the information-theoretic limit are useful benchmarks; the observation that rotated GPTQ is already close to this limit supplies a concrete, low-overhead route to near-optimal weight-only quantization. The manuscript also demonstrates the value of importing classical rate-distortion tools into the LLM compression literature.
major comments (3)
- [High-rate analysis] High-rate analysis section (around the derivation of the 2πe/12 gap): the claim that WaterSIC lies within 0.25 bit/entry of the rate-distortion limit rests on the R→∞ regime in which quantization noise is white, additive, and independent of higher-order moments. At the 3–5 bit per coordinate rates typical after waterfilling for Llama-3-8B layers, overload probability, non-uniform bin loading, and marginal kurtosis can enlarge the gap; the manuscript provides no finite-rate error analysis or simulation that quantifies how much the gap inflates at these rates.
- [Experiments] Experimental evaluation (Llama-3-8B results): the comparison between WaterSIC and rotated GPTQ is performed on a single model and a limited set of layers. Because the gap to the information-theoretic bound is asserted to be small (0.1 bit), the result is sensitive to the particular covariance structure of Llama-3-8B; a broader test across multiple model families and bit-widths is needed to support the general claim that rotated GPTQ is near-optimal.
- [Basis independence] Section on basis independence: the statement that WaterSIC performance depends only on det(Σ_X) is derived under the high-rate white-noise approximation. It is not immediately clear whether the same invariance holds once finite-rate effects (granularity, overload) are included; a short counter-example or additional derivation at moderate rates would strengthen the claim.
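For reference, the two ways the gap is quoted in the claim (a multiplicative factor of 2πe/12, or 0.25 bit/entry) are the same number, since distortion scales as 2^(-2R) at high rate, so a distortion factor c costs 0.5·log2(c) extra bits:

```python
import math

# D ~ 2^(-2R) at high rate, so a multiplicative distortion factor c
# corresponds to a rate penalty of 0.5*log2(c) bits per entry.
gap_factor = 2 * math.pi * math.e / 12   # ~1.423: scalar-quantizer gap to the Gaussian bound
gap_bits = 0.5 * math.log2(gap_factor)   # ~0.2546 bit/entry, the "0.25 bit" in the claim
```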
minor comments (2)
- [Abstract] The abstract and introduction should explicitly state the bit-width range (e.g., 3–5 bits after waterfilling) for which the 0.25-bit gap is claimed to remain representative.
- [WaterSIC description] Notation: the relationship between the waterfilling solution for the weighted MSE problem and the scalar quantizer step sizes used in WaterSIC could be written out more explicitly (one additional displayed equation would suffice).
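One plausible form of the displayed equation the comment asks for, under the standard high-rate model in which a uniform scalar quantizer with step Δ_i incurs distortion Δ_i²/12 (the symbols λ_i, θ follow the usual waterfilling notation and are our reconstruction, not the paper's):

```latex
R_i \;=\; \max\!\Bigl(0,\ \tfrac{1}{2}\log_2\frac{\lambda_i}{\theta}\Bigr),
\qquad
D_i \;=\; \min(\lambda_i,\theta) \;=\; \frac{\Delta_i^2}{12}
\quad\Longrightarrow\quad
\Delta_i \;=\; \sqrt{12\,\min(\lambda_i,\theta)},
```

where λ_i are the eigenvalues of Σ_X and θ is the water level chosen so that the per-coordinate rates R_i sum to the total rate budget.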
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the scope and limitations of our high-rate analysis. We address each major point below, indicating planned revisions where appropriate.
Point-by-point responses
-
Referee: [High-rate analysis] High-rate analysis section (around the derivation of the 2πe/12 gap): the claim that WaterSIC lies within 0.25 bit/entry of the rate-distortion limit rests on the R→∞ regime in which quantization noise is white, additive, and independent of higher-order moments. At the 3–5 bit per coordinate rates typical after waterfilling for Llama-3-8B layers, overload probability, non-uniform bin loading, and marginal kurtosis can enlarge the gap; the manuscript provides no finite-rate error analysis or simulation that quantifies how much the gap inflates at these rates.
Authors: We agree that the 2πe/12 gap is derived under the high-rate white-noise approximation. At the 3-5 bit rates relevant to Llama-3-8B after waterfilling, finite-rate effects such as overload and kurtosis can indeed increase the actual gap. We will revise the manuscript to explicitly state this assumption and its limitations, and we will add a short discussion noting that the experimental results (0.1 bit gap for rotated GPTQ) provide empirical evidence that the inflation remains modest in practice. A full finite-rate analysis is beyond the current scope. revision: partial
-
Referee: [Experiments] Experimental evaluation (Llama-3-8B results): the comparison between WaterSIC and rotated GPTQ is performed on a single model and a limited set of layers. Because the gap to the information-theoretic bound is asserted to be small (0.1 bit), the result is sensitive to the particular covariance structure of Llama-3-8B; a broader test across multiple model families and bit-widths is needed to support the general claim that rotated GPTQ is near-optimal.
Authors: The experiments use Llama-3-8B to illustrate behavior on a recent, representative LLM under realistic covariance structures. The claim is presented as suggestive rather than universal. We will add text clarifying the limited scope and that the near-optimality observation holds for this model family in the high-rate regime. Broader validation across additional models is desirable but cannot be completed in the current revision cycle. revision: no
-
Referee: [Basis independence] Section on basis independence: the statement that WaterSIC performance depends only on det(Σ_X) is derived under the high-rate white-noise approximation. It is not immediately clear whether the same invariance holds once finite-rate effects (granularity, overload) are included; a short counter-example or additional derivation at moderate rates would strengthen the claim.
Authors: The determinant characterization follows directly from the high-rate analysis where the quantization noise is white and the distortion depends only on the eigenvalues via waterfilling. We will revise the section to note that this invariance is approximate at finite rates and may be perturbed by granularity and overload. A full moderate-rate derivation or counter-example is left for future work, but the high-rate result remains a useful benchmark. revision: partial
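The high-rate invariance claim bottoms out in a linear-algebra identity: det(QΣ_XQᵀ) = det(Σ_X) for any orthogonal Q. A quick numeric check on a synthetic covariance (not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16
A = rng.standard_normal((n, n))
Sigma = A @ A.T                                    # synthetic covariance
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix

# log|det| before and after rotation -- and hence the claimed
# high-rate WaterSIC distortion -- are identical.
logdet0 = np.linalg.slogdet(Sigma)[1]
logdet1 = np.linalg.slogdet(Q @ Sigma @ Q.T)[1]
```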
Remaining open items
- A complete finite-rate error analysis quantifying gap inflation at 3–5 bits per coordinate
- Broader experimental validation across multiple model families and bit-widths
Circularity Check
No significant circularity; claims rest on classical information-theoretic results
Full rationale
The paper applies the standard reverse waterfilling solution from classical rate-distortion theory to allocate rates based on the eigenvalues of Σ_X and invokes the known high-rate scalar quantization gap of 2πe/12 relative to the Gaussian bound. These are independent external results, not derived or redefined inside the paper. The basis-free characterization via det(Σ_X) follows directly from the waterfilling formula without redefinition. GPTQ comparisons rely on external Llama-3-8B weights rather than any fitted parameters or self-citation chains. No load-bearing step reduces by construction to the paper's own inputs or prior self-citations.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Reverse waterfilling optimally distributes rate for weighted mean-squared-error source coding
Reference graph
Works this paper leans on
- [1] O. Ordentlich and Y. Polyanskiy, "High-rate quantized matrix multiplication I," 2026.
- [2] O. Ordentlich and Y. Polyanskiy, "Optimal quantization for matrix multiplication," arXiv preprint arXiv:2410.13780, 2024.
- [3] M. Nagel, R. A. Amjad, M. van Baalen, C. Louizos, and T. Blankevoort, "Up or down? Adaptive rounding for post-training quantization," in International Conference on Machine Learning. PMLR, 2020, pp. 7197–7206.
- [4] E. Frantar, S. Ashkboos, T. Hoefler, and D. Alistarh, "OPTQ: Accurate quantization for generative pre-trained transformers," in The Eleventh International Conference on Learning Representations, 2023. [Online]. Available: https://openreview.net/forum?id=tcbBPnfwxS
- [5] T. Dettmers, M. Lewis, Y. Belkada, and L. Zettlemoyer, "LLM.int8(): 8-bit matrix multiplication for transformers at scale," Advances in Neural Information Processing Systems, vol. 35, pp. 30318–30332, 2022.
- [6] B. Hassibi, D. G. Stork, and G. J. Wolff, "Optimal brain surgeon and general network pruning," in IEEE International Conference on Neural Networks. IEEE, 1993, pp. 293–299.
- [7] E. Lifar, S. Savkin, O. Ordentlich, and Y. Polyanskiy, "WaterSIC: Information-theoretically (near) optimal linear layer quantization," arXiv preprint arXiv:2603.04956, 2026.
- [8] Y. Li, R. Gong, X. Tan, Y. Yang, P. Hu, Q. Zhang, F. Yu, W. Wang, and S. Gu, "BRECQ: Pushing the limit of post-training quantization by block reconstruction," arXiv preprint arXiv:2102.05426, 2021.
- [9] A. Tseng, Z. Sun, and C. De Sa, "Model-preserving adaptive rounding," arXiv preprint arXiv:2505.22988, 2025.
- [10] H. Badri and A. Shaji, "Half-quadratic quantization of large machine learning models," November 2023. [Online]. Available: https://mobiusml.github.io/hqq_blog/
- [11] G. Xiao, J. Lin, M. Seznec, H. Wu, J. Demouth, and S. Han, "SmoothQuant: Accurate and efficient post-training quantization for large language models," in International Conference on Machine Learning. PMLR, 2023, pp. 38087–38099.
- [12] S. Savkin, E. Porat, O. Ordentlich, and Y. Polyanskiy, "NestQuant: Nested lattice quantization for matrix products and LLMs," arXiv preprint arXiv:2502.09720, 2025.
- [13] S. Zhang, H. Zhang, I. Colbert, and R. Saab, "Qronos: Correcting the past by shaping the future... in post-training quantization," arXiv preprint arXiv:2505.11695, 2025.
- [14] H. Zhang, S. Zhang, I. Colbert, and R. Saab, "Provable post-training quantization: Theoretical analysis of OPTQ and Qronos," arXiv preprint arXiv:2508.04853, 2025.
- [15] Y. Polyanskiy and Y. Wu, Information Theory: From Coding to Learning. Cambridge University Press, 2024.
- [16] A. Harbuzova, O. Ordentlich, and Y. Polyanskiy, "Price of metric universality in vector quantization is at most 0.11 bit," arXiv preprint arXiv:2602.05790, 2026.
- [17] J. Chee, Y. Cai, V. Kuleshov, and C. M. De Sa, "QuIP: 2-bit quantization of large language models with guarantees," Advances in Neural Information Processing Systems, vol. 36, pp. 4396–4429, 2023.
- [18] J. Chen, Y. Shabanzadeh, E. Crnčević, T. Hoefler, and D. Alistarh, "The geometry of LLM quantization: GPTQ as Babai's nearest plane algorithm," arXiv preprint arXiv:2507.18553, 2025.
- [19] J. Birnick, "The lattice geometry of neural network quantization: A short equivalence proof of GPTQ and Babai's algorithm," arXiv preprint arXiv:2508.01077, 2025.
- [20] R. Zamir, Lattice Coding for Signals and Networks: A Structured Coding Approach to Quantization, Modulation, and Multiuser Information Theory. Cambridge University Press, 2014.
- [21] A. Tseng, J. Chee, Q. Sun, V. Kuleshov, and C. De Sa, "QuIP#: Even better LLM quantization with Hadamard incoherence and lattice codebooks," arXiv preprint arXiv:2402.04396, 2024.
- [22] O. Ordentlich, O. Regev, and B. Weiss, "Bounds on the density of smooth lattice coverings," arXiv preprint arXiv:2311.04644, 2023.
- [23] W. R. Bennett, "Spectra of quantized signals," The Bell System Technical Journal, vol. 27, no. 3, pp. 446–472, 1948.
- [24] P. Panter and W. Dite, "Quantization distortion in pulse-count modulation with nonuniform spacing of levels," Proceedings of the IRE, vol. 39, no. 1, pp. 44–48, 1951.
- [25] P. Zador, "Asymptotic quantization error of continuous signals and the quantization dimension," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 139–149, 1982.
- [26] A. Gersho, "Asymptotically optimal block quantization," IEEE Transactions on Information Theory, vol. 25, no. 4, pp. 373–380, 1979.
- [27] M. V. Eyuboglu and G. D. Forney, "Lattice and trellis quantization with lattice- and trellis-bounded codebooks: High-rate theory for memoryless sources," IEEE Transactions on Information Theory, vol. 39, no. 1, pp. 46–59, 1993.
- [28] R. Zamir and M. Feder, "On lattice quantization noise," IEEE Transactions on Information Theory, vol. 42, no. 4, pp. 1152–1159, 1996.
- [29] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed., ser. Grundlehren der mathematischen Wissenschaften. New York: Springer-Verlag, 1999, vol. 290.
- [30] O. Ordentlich, "The Voronoi spherical CDF for lattices and linear codes: New bounds for quantization and coding," arXiv preprint arXiv:2506.19791, 2025.
- [31] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, "V-BLAST: An architecture for realizing very high data rates over the rich-scattering wireless channel," in 1998 URSI International Symposium on Signals, Systems, and Electronics. IEEE, 1998, pp. 295–300.
- [32] L. Babai, "On Lovász' lattice reduction and the nearest lattice point problem," Combinatorica, vol. 6, no. 1, pp. 1–13, 1986.
- [33] U. Fincke and M. Pohst, "Improved methods for calculating vectors of short length in a lattice, including a complexity analysis," Mathematics of Computation, vol. 44, no. 170, pp. 463–471, 1985.
- [34] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, "Closest point search in lattices," IEEE Transactions on Information Theory, vol. 48, no. 8, pp. 2201–2214, 2002.
- [35] G. D. Forney, "Trellis shaping," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 281–300, 1992.
- [36] S. Savkin, E. Porat, O. Ordentlich, and Y. Polyanskiy, "NestQuant: Nested lattice quantization for matrix products and LLMs," Proc. International Conference on Machine Learning (ICML), 2025.
- [37] A. Tseng, Q. Sun, D. Hou, and C. De Sa, "QTIP: Quantization with trellises and incoherence processing," arXiv preprint arXiv:2406.11235, 2024.
- [38] N. Elhage, R. Lasenby, and C. Olah, "Privileged bases in the transformer residual stream," Transformer Circuits Thread, 2023. [Online]. Available: https://transformer-circuits.pub/2023/privileged-basis/index.html
- [40] E. Agrell and B. Allen, "On the best lattice quantizers," IEEE Transactions on Information Theory, vol. 69, no. 12, pp. 7650–7658, 2023.
discussion (0)