pith. machine review for the scientific record.

arxiv: 2604.03634 · v4 · submitted 2026-04-04 · 💻 cs.LG · cs.IT · eess.SP · math.IT

Recognition: 2 theorem links

· Lean Theorem

Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations

Mitchell A. Thornton

Pith reviewed 2026-05-13 18:49 UTC · model grok-4.3

classification 💻 cs.LG · cs.IT · eess.SP · math.IT
keywords group-theoretic estimation · single-snapshot averaging · subspace decomposition · algebraic averaging · representation symmetry · processing gain · spectral estimation · replacement theorem

The pith

Group averaging applied to a single observation can match the subspace estimation of traditional multi-observation covariance methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that temporal averaging over many independent observations is merely the special case of algebraic group action when the group is trivial. By applying a chosen group of transformations to one snapshot, an estimator achieves the same subspace decomposition that would otherwise require multiple snapshots. A sympathetic reader would care because this reframes data collection requirements in signal processing: performance gains derive from the order of the symmetry group rather than the number of separate measurements. The approach unifies classical transforms such as the DFT as instances of matching the data's underlying group structure and extends conjecturally to arbitrary statistics.

Core claim

Temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group G={e}. A General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent subspace decomposition to multi-snapshot covariance estimation. The Trivial Group Embedding Theorem shows that the sample covariance is the accumulation of trivial-group estimates whose variance is governed by a (G,L) continuum as 1/(|G|·L). Processing gain equals classical beamforming gain and is therefore a property of group order. The DFT, DCT, and KLT arise as group-matched special cases. Monte Carlo experiments on the first four sample moments across five group types confirm the paper's conjectured extension to arbitrary statistics to four-digit precision.
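
A minimal sketch of the single-snapshot construction, under two assumptions that the source does not confirm as the paper's exact definitions: the group is the cyclic group Z_M acting on a length-M snapshot by circular shifts, and the estimator is the orbit average R_G = (1/|G|) Σ_g (g·x)(g·x)^H. All function names are hypothetical. The sketch contrasts the trivial-group estimate (one rank-one term) with the Z_M average, which comes out circulant.

```python
# Sketch only: cyclic-shift action and orbit-averaged outer product are
# assumptions made for illustration, not the paper's stated definitions.

def cyclic_shift(x, k):
    """Return the snapshot circularly shifted by k positions."""
    size = len(x)
    return [x[(n - k) % size] for n in range(size)]

def outer(x):
    """Rank-one matrix x x^H as a list of rows."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]

def group_averaged_covariance(x):
    """Average the rank-one estimate over all M cyclic shifts of x."""
    size = len(x)
    acc = [[0j] * size for _ in range(size)]
    for k in range(size):
        term = outer(cyclic_shift(x, k))
        for i in range(size):
            for j in range(size):
                acc[i][j] += term[i][j] / size
    return acc

def is_circulant(r, tol=1e-12):
    """True when r[i][j] depends only on (i - j) mod M."""
    size = len(r)
    return all(abs(r[i][j] - r[(i + 1) % size][(j + 1) % size]) < tol
               for i in range(size) for j in range(size))

snapshot = [1 + 2j, -0.5j, 3.0 + 0j, 0.25 - 1j, -2 + 0.5j, 1.5 + 0j]
r_trivial = outer(snapshot)                     # G = {e}: a single rank-one term
r_cyclic = group_averaged_covariance(snapshot)  # G = Z_6: six shifted terms averaged

print("trivial-group estimate circulant:", is_circulant(r_trivial))
print("Z_6-averaged estimate circulant: ", is_circulant(r_cyclic))
```

The averaged matrix keeps the snapshot's total energy on its trace while gaining the shift-invariant structure that the single rank-one term lacks.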

What carries the argument

The General Replacement Theorem, which equates a single-snapshot group-averaged estimator to the subspace decomposition obtained from multi-snapshot covariance estimation by leveraging representation-theoretic symmetry.

If this is right

  • Processing gain scales with group order |G| and equals 10 log10(|G|) dB independent of sensor count.
  • The DFT, DCT, and KLT are recovered as special cases of group-matched estimators.
  • Sample covariance arises exactly as the accumulation of trivial-group estimates.
  • Single-snapshot methods become viable for MUSIC, massive MIMO, waveform classification, graph signals, and transformer analysis.
  • Variance reduction for arbitrary statistics is conjectured to follow the effective group order d_eff.
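
The DFT bullet can be checked numerically in the simplest case. The sketch below assumes the cyclic group Z_M acting by circular shifts (an illustrative choice, with hypothetical function names) and verifies that each DFT vector f_m is an eigenvector of the shift-averaged matrix, with eigenvalue |X[m]|²/M where X is the unnormalized DFT of the snapshot.

```python
import cmath

def dft(x):
    """Unnormalized DFT: X[m] = sum_n x[n] * exp(-2*pi*i*m*n/M)."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * m * n / size)
                for n in range(size)) for m in range(size)]

def cyclic_group_average(x):
    """R_G[i][j] = (1/M) * sum_k x[(i-k) % M] * conj(x[(j-k) % M])."""
    size = len(x)
    return [[sum(x[(i - k) % size] * x[(j - k) % size].conjugate()
                 for k in range(size)) / size
             for j in range(size)] for i in range(size)]

def eigen_residual(r, x, m):
    """Largest entry of |R f_m - lambda_m f_m| with lambda_m = |X[m]|^2 / M."""
    size = len(x)
    f = [cmath.exp(2j * cmath.pi * m * n / size) / size ** 0.5
         for n in range(size)]
    lam = abs(dft(x)[m]) ** 2 / size
    rf = [sum(r[i][j] * f[j] for j in range(size)) for i in range(size)]
    return max(abs(rf[i] - lam * f[i]) for i in range(size))

snapshot = [0.3 + 1j, -1.2 + 0j, 0.7 - 0.4j, 2.0 + 0j,
            -0.5 + 0.5j, 1.1j, -0.9 + 0j, 0.2 - 2j]
r_g = cyclic_group_average(snapshot)
worst_residual = max(eigen_residual(r_g, snapshot, m)
                     for m in range(len(snapshot)))
print("largest eigen-equation residual:", worst_residual)
```

In this cyclic case the eigenvalues of the averaged matrix are the periodogram values of the single snapshot, which is one concrete reading of "group-matched" for the DFT.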

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework may support real-time estimation on non-stationary signals where repeated snapshots cannot be collected.
  • Automatic or blind selection of the group G could yield adaptive algorithms for data whose symmetry is unknown in advance.
  • In learning systems, explicit symmetry groups might reduce the number of training examples needed by exploiting structural regularities rather than statistical volume.

Load-bearing premise

The data object possesses representation-theoretic symmetry under the chosen group such that the group action preserves the relevant subspace structure.
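
One way to probe this premise numerically is a commutativity check between a group generator P and a covariance matrix R. The residual used below, ‖PR − RP‖_F / ‖R‖_F, is an assumed stand-in chosen because it is scale-invariant; the paper's own definition of its commutativity residual δ is not given on this page. Matrix values and function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def frobenius_norm(a):
    """Square root of the sum of squared entry magnitudes."""
    return math.sqrt(sum(abs(v) ** 2 for row in a for v in row))

def cyclic_shift_matrix(size):
    """Permutation matrix sending basis vector j to basis vector (j + 1) mod size."""
    return [[1 if i == (j + 1) % size else 0 for j in range(size)]
            for i in range(size)]

def commutativity_residual(r, p):
    """Assumed residual: ||P R - R P||_F / ||R||_F (zero when P commutes with R)."""
    pr, rp = matmul(p, r), matmul(r, p)
    diff = [[pr[i][j] - rp[i][j] for j in range(len(r))] for i in range(len(r))]
    return frobenius_norm(diff) / frobenius_norm(r)

shift = cyclic_shift_matrix(4)
first_column = [4, 1, 0, 1]
# Circulant covariance: symmetric under cyclic shifts, so the premise holds.
matched = [[first_column[(i - j) % 4] for j in range(4)] for i in range(4)]
# Diagonal covariance with unequal entries: no cyclic symmetry.
mismatched = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]

delta_matched = commutativity_residual(matched, shift)
delta_mismatched = commutativity_residual(mismatched, shift)
print("matched residual:   ", delta_matched)
print("mismatched residual:", delta_mismatched)
```

A residual of zero means the shift preserves every eigenspace of R; a positive value signals that averaging over this group would mix subspaces the estimator is supposed to keep apart.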

What would settle it

Controlled experiments in which the group-averaged single-snapshot subspace estimate deviates from the multi-snapshot covariance estimate by more than the predicted variance scaling 1/(|G|·L), or Monte Carlo runs on sample moments that fail to match the conjectured d_eff scaling to four-digit precision.
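
A toy version of such a check, not the paper's experiment: the sketch assumes i.i.d. unit-variance Gaussian snapshot entries, the statistic f(x) = x[0], and subgroups of Z_8 acting by circular shifts. Under those assumptions the orbit elements are independent, so the predicted variance is exactly 1/(|G|·L). The names, seed, and trial count are arbitrary.

```python
import random
import statistics

SNAPSHOT_LENGTH = 8

def subgroup_shifts(order):
    """Shifts forming the subgroup of Z_8 with the given order (1, 2, 4, or 8)."""
    step = SNAPSHOT_LENGTH // order
    return [k * step for k in range(order)]

def orbit_averaged_estimate(snapshots, order):
    """Average f(x) = x[0] over the subgroup orbit, then over the L snapshots."""
    shifts = subgroup_shifts(order)
    per_snapshot = [
        # f applied to the snapshot shifted by s reads entry (-s) mod M.
        sum(x[(-s) % SNAPSHOT_LENGTH] for s in shifts) / order
        for x in snapshots
    ]
    return sum(per_snapshot) / len(per_snapshot)

def empirical_variance(order, num_snapshots, trials, rng):
    """Monte Carlo variance of the estimator over independent trials."""
    estimates = []
    for _ in range(trials):
        snapshots = [[rng.gauss(0.0, 1.0) for _ in range(SNAPSHOT_LENGTH)]
                     for _ in range(num_snapshots)]
        estimates.append(orbit_averaged_estimate(snapshots, order))
    return statistics.pvariance(estimates)

rng = random.Random(0)
results = {}
for order, num_snapshots in [(1, 1), (8, 1), (1, 8), (4, 2), (8, 4)]:
    measured = empirical_variance(order, num_snapshots, 4000, rng)
    results[(order, num_snapshots)] = (measured, 1.0 / (order * num_snapshots))

for (order, num_snapshots), (measured, predicted) in results.items():
    print(f"|G|={order} L={num_snapshots} measured={measured:.4f} predicted={predicted:.4f}")
```

The pairs (8, 1) and (1, 8) share the same prediction, which is the (G,L) trade the page describes; the agreement here depends entirely on the independence assumption and says nothing about data whose symmetry is only approximate.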

Figures

Figures reproduced from arXiv: 2604.03634 by Mitchell A. Thornton.

Figure 1: Eigenvalue SNR versus number of permutations
Figure 2: Massive MIMO: AD vs. MMSE at SNR = 15 dB, K = 4 users. (a) Effective throughput vs. M for three CDL channel models. Dashed: MMSE; solid: AD. (b) Percentage gain of AD over MMSE. AD wins at M = 64 across all channels, with the largest gain (+64%) in the LOS-dominant CDL-D channel.
Figure 3: Single-pulse chirp characterization via the …
Figure 4: SNR robustness of the chirp-adapted group …
Figure 5: Four-class single-pulse waveform classification
Figure 7: SNR threshold comparison for single-pulse clas…
Figure 8: Non-stationary modulated source scenario
Figure 9: Chirp characterization against a non-stationary …
Figure 10: Commutativity residual δ for five generic permutation generators tested against graph-diffusion covariance on six graphs. Blue bars: generators that are graph automorphisms (δ = 0). Red bars: generators that are not automorphisms (δ > 0). In all cases, δ separates automorphisms from non-automorphisms, and the generator with minimum δ is an automorphism (Theorem 25).
Figure 12: Three candidate graphs from the systematic filtering pipeline (…
Figure 13: Three strongest candidate graphs from the …
Figure 14: Deep analysis of graph C5 (K4 clique + pendant, Aut ≅ S3). (a) Conjugation stress test: S3 advantage persists at +17% even with 500 random conjugation candidates; the Laplacian eigenvector conjugation performs poorly (ψ = 0.35). (b) SNR sweep: the advantage emerges above 0 dB, stabilizes at +21% for SNR ≥ 15 dB, and persists at 30 dB (structural, not noise artifact). (c) Eigenvalue anatomy: S3 concen…
Figure 18: Single-snapshot AD spectrum (red) vs. L = 100 averaged spectrum (blue) for M = 64 with three embedded signals. The true KL spectrum is shown in gray. AD from one observation achieves spectral resolution comparable to 100-snapshot averaging.
Figure 19: Commutativity residual δ (left) and absolute mismatch δ̃ (right) vs. SNR for G = Z8, M = 8. The scale-invariant δ is flat; the energy-weighted δ̃ grows with SNR. [Plot axes: Input SNR (dB); commutativity residual (dimensionless).]
Original abstract

We establish that temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group $G=\{e\}$. A General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent subspace decomposition to multi-snapshot covariance estimation. The Trivial Group Embedding Theorem proves that the sample covariance is the accumulation of trivial-group estimates, with variance governed by a $(G,L)$ continuum as $1/(|G|\cdot L)$. The processing gain $10\log_{10}(M)$ dB equals the classical beamforming gain, establishing that this gain is a property of group order, not sensor count. The DFT, DCT, and KLT are unified as group-matched special cases. We conjecture a General Algebraic Averaging Theorem extending these results to arbitrary statistics, with variance governed by the effective group order $d_{\mathrm{eff}}$. Monte Carlo experiments on the first four sample moments across five group types confirm the conjecture to four-digit precision. The framework exploits the $structure$ of information (representation-theoretic symmetry of the data object) rather than the content, complementing Shannon's theory. Five applications are demonstrated: single-snapshot MUSIC, massive MIMO, single-pulse waveform classification, graph signal processing, and analysis of transformer LLMs. Techniques for blind group matching are described.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript claims that temporal averaging over multiple observations is a special case of algebraic averaging under the trivial group G={e}. It introduces a General Replacement Theorem asserting that a single group-averaged snapshot yields equivalent subspace decomposition to multi-snapshot covariance estimation, a Trivial Group Embedding Theorem showing that the sample covariance accumulates trivial-group estimates with variance scaling as 1/(|G|·L), and unifies the DFT, DCT, and KLT as group-matched cases. A General Algebraic Averaging Theorem is conjectured for arbitrary statistics with variance governed by effective group order d_eff, supported by Monte Carlo confirmation of the first four sample moments to four-digit precision across five group types. The framework is applied to single-snapshot MUSIC, massive MIMO, waveform classification, graph signal processing, and transformer LLMs, emphasizing representation-theoretic symmetry over data content.

Significance. If the Replacement and Embedding Theorems hold under the stated conditions, the work provides a unifying algebraic perspective on spectral estimation that recasts classical processing gain as a function of group order rather than sensor count, with potential to enable single-observation methods in data-scarce regimes. The Monte Carlo validation to four-digit precision on moments and the explicit unification of DFT/DCT/KLT constitute concrete, reproducible support for the conjecture. The approach complements Shannon theory by exploiting algebraic structure and is demonstrated across five distinct applications.

major comments (3)
  1. [General Replacement Theorem] General Replacement Theorem: the claimed equivalence between one group-averaged snapshot and multi-snapshot covariance for subspace decomposition requires that the chosen group action commutes exactly with the signal model and preserves the relevant subspaces; the manuscript does not state the precise representation-theoretic conditions guaranteeing this commutativity, which is load-bearing for the central claim (any mismatch would cause the averaged estimator to lose the shared eigenspace).
  2. [Trivial Group Embedding Theorem] Trivial Group Embedding Theorem: the variance scaling 1/(|G|·L) and the (G,L) continuum are asserted to follow from group-order properties, yet no explicit derivation reducing these predictions to the input parameters or showing how d_eff is obtained from the representation is supplied; without this reduction the scaling claim cannot be verified independently of the Monte Carlo results.
  3. [Monte Carlo experiments] Monte Carlo experiments: confirmation of the conjecture to four-digit precision on the first four moments is reported across five group types, but the specific definitions of the groups, the precise computation of d_eff, and any data-exclusion criteria are omitted; these details are necessary to assess whether the numerical support generalizes beyond the tested cases.
minor comments (2)
  1. The abstract states that five applications are demonstrated, yet no quantitative performance metrics, baseline comparisons, or specific parameter settings for those demonstrations are summarized, making it difficult to gauge practical impact.
  2. Notation for d_eff should be introduced with an explicit formula or definition immediately after its first appearance, and its relation to |G| clarified in the statement of the conjectured theorem.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript. Their comments highlight important aspects that require clarification and additional detail. We address each major comment point by point below and outline the revisions we will make to the manuscript.

Point-by-point responses
  1. Referee: [General Replacement Theorem] General Replacement Theorem: the claimed equivalence between one group-averaged snapshot and multi-snapshot covariance for subspace decomposition requires that the chosen group action commutes exactly with the signal model and preserves the relevant subspaces; the manuscript does not state the precise representation-theoretic conditions guaranteeing this commutativity, which is load-bearing for the central claim (any mismatch would cause the averaged estimator to lose the shared eigenspace).

    Authors: We agree that the precise representation-theoretic conditions ensuring commutativity between the group action and the signal model are essential to the validity of the General Replacement Theorem and should be stated explicitly. In the revised manuscript, we will insert a new lemma immediately preceding the theorem that formalizes these conditions: the group representation must commute with the covariance operator of the signal model and preserve the relevant eigenspaces. This addition will make the load-bearing assumptions transparent without altering the theorem statement itself. revision: yes

  2. Referee: [Trivial Group Embedding Theorem] Trivial Group Embedding Theorem: the variance scaling 1/(|G|·L) and the (G,L) continuum are asserted to follow from group-order properties, yet no explicit derivation reducing these predictions to the input parameters or showing how d_eff is obtained from the representation is supplied; without this reduction the scaling claim cannot be verified independently of the Monte Carlo results.

    Authors: We acknowledge that an explicit, self-contained derivation of the variance scaling 1/(|G|·L) and the definition of d_eff from the group representation would strengthen the Trivial Group Embedding Theorem. In the revision we will add a dedicated appendix section containing the step-by-step reduction: starting from the representation matrices, we derive the variance expression by direct computation of the second-moment tensor under the group action, showing how d_eff emerges as the dimension of the isotypic component. This derivation will be independent of the Monte Carlo results. revision: yes

  3. Referee: [Monte Carlo experiments] Monte Carlo experiments: confirmation of the conjecture to four-digit precision on the first four moments is reported across five group types, but the specific definitions of the groups, the precise computation of d_eff, and any data-exclusion criteria are omitted; these details are necessary to assess whether the numerical support generalizes beyond the tested cases.

    Authors: We will supply the missing experimental details in the revised manuscript. The five group types will be defined explicitly (cyclic group of order 8, dihedral group of order 16, symmetric group S4, quaternion group, and the trivial group), the formula for d_eff will be stated as the multiplicity-weighted sum of irreducible-representation dimensions, and the data-exclusion criterion (rejection of trials with condition number exceeding 10^6) will be documented. These additions will appear in the main experimental section together with a pointer to the supplementary code. revision: yes
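
The five group types named in the third response can be written down explicitly as permutation groups, which makes their orders checkable. The sketch below is illustrative: the generator choices (rotation and reflection on 8 points, a transposition and 4-cycle on 4 points, left multiplication by i and j on the quaternion elements ordered 1, −1, i, −i, j, −j, k, −k) are assumptions, as is the simplification d_eff = |G| used for the gain column, which the rebuttal's formula need not reproduce.

```python
import math

def generate_group(generators, degree):
    """Close the identity under left multiplication by the generators."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for g in generators:
            product = tuple(g[current[n]] for n in range(degree))  # g after current
            if product not in elements:
                elements.add(product)
                frontier.append(product)
    return elements

Z8_ROTATION = (1, 2, 3, 4, 5, 6, 7, 0)      # n -> n + 1 mod 8
D8_REFLECTION = (0, 7, 6, 5, 4, 3, 2, 1)    # n -> -n mod 8
S4_TRANSPOSITION = (1, 0, 2, 3)
S4_FOUR_CYCLE = (1, 2, 3, 0)
Q8_LEFT_MULT_I = (2, 3, 1, 0, 6, 7, 5, 4)   # left multiplication by i
Q8_LEFT_MULT_J = (4, 5, 7, 6, 1, 0, 2, 3)   # left multiplication by j

groups = {
    "trivial": generate_group([], 8),
    "cyclic Z8": generate_group([Z8_ROTATION], 8),
    "dihedral (order 16)": generate_group([Z8_ROTATION, D8_REFLECTION], 8),
    "symmetric S4": generate_group([S4_TRANSPOSITION, S4_FOUR_CYCLE], 4),
    "quaternion Q8": generate_group([Q8_LEFT_MULT_I, Q8_LEFT_MULT_J], 8),
}
orders = {name: len(elements) for name, elements in groups.items()}
# Assumption: d_eff is taken equal to |G| purely for illustration.
gains_db = {name: 10 * math.log10(order) for name, order in orders.items()}

for name in groups:
    print(f"{name:22s} |G| = {orders[name]:2d}  10*log10|G| = {gains_db[name]:.2f} dB")
```

Note that the cyclic and quaternion groups share order 8, so any difference between them in an experiment would have to come from representation structure rather than group order alone.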

Circularity Check

0 steps flagged

No load-bearing circularity; variance scaling and replacement theorem derived from group order without reduction to fitted inputs

full rationale

The derivation chain begins from representation-theoretic symmetry under group G and shows temporal averaging as the trivial-group case G={e}. The General Replacement Theorem equates single-snapshot group-averaged estimators to multi-snapshot covariance via algebraic averaging, with variance scaling 1/(|G|·L) following directly from group-order properties rather than parameter fitting. The Trivial Group Embedding Theorem and processing-gain equivalence are likewise obtained by embedding the classical sample covariance into the (G,L) continuum. Monte Carlo validation on sample moments across group types provides independent numerical confirmation to four-digit precision. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the stated theorems or conjecture. The minor score of 2 reflects only the possibility of non-load-bearing prior citations not required for the central claims.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard group representation theory and the assumption that data exhibits matching symmetry; no free parameters are explicitly fitted beyond the effective group order in the conjecture.

free parameters (1)
  • effective group order d_eff
    Introduced in the conjecture to govern variance for arbitrary statistics; value not specified but used to extend results.
axioms (2)
  • domain assumption Data objects possess representation-theoretic symmetry under group actions that preserve subspace structure
    Invoked to justify equivalence of group-averaged single-snapshot estimators to multi-snapshot methods.
  • standard math Standard algebraic properties of finite groups and their representations hold
    Used throughout definitions of group action, averaging, and embedding theorems.

pith-pipeline@v0.9.0 · 5527 in / 1410 out tokens · 49856 ms · 2026-05-13T18:49:28.900340+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Algebraic Diversity: Principles of a Group-Theoretic Approach to Signal Processing

    eess.SP 2026-04 unverdicted novelty 7.0

    Algebraic diversity uses matched groups of signal symmetries for group-orbit averaging to cut variance, defines structural capacity kappa as a Renyi-2 entropy analog, and enables blind group identification via Lie alg...

  2. Continuous Algebraic Diversity: Unifying Spectral, Wavelet, and Time-Frequency Analysis via Lie Group Actions

    eess.SP 2026-04 unverdicted novelty 7.0

    Algebraic diversity on Lie groups unifies spectral analysis (translation group), wavelets (affine group), time-frequency analysis (Heisenberg-Weyl group), and spherical harmonics (SO(3)), with a commutativity residual...

  3. Polynomial-Time Optimal Group Selection via the Double-Commutator Eigenvalue Problem

    cs.LG 2026-04 unverdicted novelty 7.0

    Optimal group selection for covariance matching reduces exactly to the minimum eigenvector of the double-commutator matrix, solvable in O(d²M² + d³) time.

  4. Unification of Signal Transform Theory

    eess.SP 2026-05 unverdicted novelty 6.0

    Signal transforms are unified as group representation eigenbases, with an algorithm to find the matched group from empirical covariances.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · cited by 4 Pith papers · 1 internal anchor
