arxiv: 2604.03634 · v4 · submitted 2026-04-04 · 💻 cs.LG · cs.IT · eess.SP · math.IT

Recognition: 2 theorem links

· Lean Theorem

Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations

Mitchell A. Thornton

Pith reviewed 2026-05-13 18:49 UTC · model grok-4.3

classification 💻 cs.LG · cs.IT · eess.SP · math.IT

keywords group-theoretic estimation · single-snapshot averaging · subspace decomposition · algebraic averaging · representation symmetry · processing gain · spectral estimation · replacement theorem


The pith

Group averaging applied to a single observation can match the subspace estimation of traditional multi-observation covariance methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that temporal averaging over many independent observations is merely the special case of algebraic group action when the group is trivial. By applying a chosen group of transformations to one snapshot, an estimator achieves the same subspace decomposition that would otherwise require multiple snapshots. A sympathetic reader would care because this reframes data collection requirements in signal processing: performance gains derive from the order of the symmetry group rather than the number of separate measurements. The approach unifies classical transforms such as the DFT as instances of matching the data's underlying group structure and extends conjecturally to arbitrary statistics.

Core claim

Temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group G={e}. A General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent subspace decomposition to multi-snapshot covariance estimation. The Trivial Group Embedding Theorem shows that the sample covariance is the accumulation of trivial-group estimates whose variance is governed by a (G,L) continuum as 1/(|G|·L). Processing gain equals classical beamforming gain and is therefore a property of group order. The DFT, DCT, and KLT arise as group-matched special cases. Monte Carlo experiments on the first four sample moments across five group types confirm the conjecture to four-digit precision.
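One way to see the single-snapshot claim concretely is the cyclic case. The sketch below assumes the group-averaged estimator has the form R_G = (1/|G|) Σ_g (g·x)(g·x)^H, with G the cyclic shift group Z_M acting on one snapshot x. That form, and the helper names `group_averaged_covariance` and `cyclic_group`, are illustrative assumptions rather than definitions taken from the paper. Under that assumption the estimate is circulant, so the DFT diagonalizes it, which is consistent with the DFT appearing as a group-matched special case.

```python
import numpy as np

def group_averaged_covariance(x, group_actions):
    """Average the rank-one outer products (g.x)(g.x)^H over the group elements.

    Assumed (hypothetical) form of a single-snapshot group-averaged estimator.
    """
    M = x.shape[0]
    R = np.zeros((M, M), dtype=complex)
    for act in group_actions:
        y = act(x)
        R += np.outer(y, y.conj())
    return R / len(group_actions)

def cyclic_group(M):
    """All M cyclic shifts, i.e. the action of Z_M on a length-M vector."""
    return [lambda v, s=s: np.roll(v, s) for s in range(M)]

rng = np.random.default_rng(0)
M = 8
n = np.arange(M)
# One snapshot: a complex tone in DFT bin 2 plus weak complex noise.
noise = 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
x = np.exp(2j * np.pi * 2 * n / M) + noise

R = group_averaged_covariance(x, cyclic_group(M))

# A circulant matrix is unchanged when rows and columns are shifted together.
is_circulant = np.allclose(R, np.roll(np.roll(R, 1, axis=0), 1, axis=1))

# The unitary DFT matrix diagonalizes any circulant matrix.
F = np.fft.fft(np.eye(M)) / np.sqrt(M)
D = F @ R @ F.conj().T
off_diag = D - np.diag(np.diag(D))

# The eigenvalues coincide with the periodogram of the single snapshot.
periodogram = np.abs(np.fft.fft(x)) ** 2 / M
print("circulant:", is_circulant)
print("dominant bin:", int(np.argmax(periodogram)))
```

In this toy setting the eigenvalues of the single-snapshot estimate equal the periodogram |X_k|²/M, which is the sense in which the DFT is matched to Z_M here. It says nothing about groups or signals that do not share this symmetry.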

What carries the argument

The General Replacement Theorem, which equates a single-snapshot group-averaged estimator to the subspace decomposition obtained from multi-snapshot covariance estimation by leveraging representation-theoretic symmetry.

If this is right

Processing gain scales with group order |G| and equals 10 log10(|G|) dB independent of sensor count.
The DFT, DCT, and KLT are recovered as special cases of group-matched estimators.
Sample covariance arises exactly as the limit of trivial-group accumulations.
Single-snapshot methods become viable for MUSIC, massive MIMO, waveform classification, graph signals, and transformer analysis.
Variance reduction for arbitrary statistics is conjectured to follow the effective group order d_eff.
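The two scaling statements above reduce to simple arithmetic. The following is a minimal sketch that evaluates the stated gain 10 log10(|G|) dB and the stated variance factor 1/(|G|·L); the function names are hypothetical and the formulas are taken at face value from the claims, not derived.

```python
import math

def processing_gain_db(group_order):
    """Stated processing gain, 10*log10(|G|), in dB."""
    return 10.0 * math.log10(group_order)

def variance_factor(group_order, num_snapshots):
    """Stated variance scaling, 1 / (|G| * L)."""
    return 1.0 / (group_order * num_snapshots)

for order in (1, 8, 64):
    print(f"|G| = {order:3d}: gain = {processing_gain_db(order):6.2f} dB")

# The (G, L) continuum: different splits of the same product |G|*L
# give the same predicted variance factor.
for order, snapshots in [(1, 64), (8, 8), (64, 1)]:
    print(f"|G| = {order:2d}, L = {snapshots:2d}: factor = "
          f"{variance_factor(order, snapshots):.6f}")
```

On this reading, a group of order 64 applied to one snapshot and 64 snapshots under the trivial group sit at the two ends of the same continuum.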

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework may support real-time estimation on non-stationary signals where repeated snapshots cannot be collected.
Automatic or blind selection of the group G could yield adaptive algorithms for data whose symmetry is unknown in advance.
In learning systems, explicit symmetry groups might reduce the number of training examples needed by exploiting structural regularities rather than statistical volume.

Load-bearing premise

The data object possesses representation-theoretic symmetry under the chosen group such that the group action preserves the relevant subspace structure.
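A rough numerical check of this premise is to ask whether a candidate group generator commutes with the covariance. The sketch below uses an assumed residual, δ = ‖RP − PR‖_F / ‖R‖_F, which may differ from the paper's own definition of its commutativity residual; the function name and the test matrices are illustrative only.

```python
import numpy as np

def commutativity_residual(R, P):
    """Assumed residual: Frobenius norm of the commutator, normalized by ||R||_F."""
    commutator = R @ P - P @ R
    return np.linalg.norm(commutator, "fro") / np.linalg.norm(R, "fro")

M = 6
# A symmetric circulant covariance: R[i, j] = c[(i - j) mod M].
c = np.array([1.0, 0.5, 0.2, 0.1, 0.2, 0.5])
idx = (np.arange(M)[:, None] - np.arange(M)[None, :]) % M
R = c[idx]

# Matched generator: the cyclic shift, which commutes with every circulant matrix.
shift = np.roll(np.eye(M), 1, axis=0)
# Mismatched generator: a transposition of the first two coordinates.
swap = np.eye(M)[[1, 0, 2, 3, 4, 5]]

delta_shift = commutativity_residual(R, shift)
delta_swap = commutativity_residual(R, swap)
print("shift residual:", delta_shift)
print("swap residual:", delta_swap)
```

The matched generator gives a residual of zero and the mismatched one does not, which is the kind of separation the premise relies on.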

What would settle it

Controlled experiments in which the group-averaged single-snapshot subspace estimate deviates from the multi-snapshot covariance estimate by more than the predicted variance scaling 1/(|G|·L), or Monte Carlo runs on sample moments that fail to match the conjectured d_eff scaling to four-digit precision.
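A test of this kind can be sketched in the simplest matched case. The example below assumes white complex Gaussian snapshots with identity covariance, for which every cyclic shift is a symmetry, and estimates the diagonal power by averaging |x|² over the full cyclic group Z_M and over L snapshots. The setup and the name `empirical_variance` are assumptions made for illustration; it does not reproduce the paper's experiments.

```python
import numpy as np

def empirical_variance(M, L, trials, rng):
    """Variance, across trials, of the power estimate averaged over Z_M and L snapshots.

    Averaging |x_n|^2 over all M cyclic shifts of entry 0 is the same as
    averaging over all M entries, so the estimate is the mean of |X|^2.
    """
    shape = (trials, L, M)
    X = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    estimates = np.mean(np.abs(X) ** 2, axis=(1, 2))
    return estimates.var()

rng = np.random.default_rng(1)
results = {}
for M, L in [(1, 16), (4, 4), (16, 1)]:
    results[(M, L)] = empirical_variance(M, L, trials=20000, rng=rng)
    predicted = 1.0 / (M * L)
    print(f"|G| = {M:2d}, L = {L:2d}: empirical = {results[(M, L)]:.4f}, "
          f"predicted = {predicted:.4f}")
```

All three configurations share the product |G|·L = 16, so the prediction is the same for each. A real falsification test would use structured covariances and a deliberately mismatched group, where the independence that makes this toy case exact no longer holds.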

Figures

Figures reproduced from arXiv: 2604.03634 by Mitchell A. Thornton.

**Figure 2.** Massive MIMO: AD vs. MMSE at SNR = 15 dB, K = 4 users. (a) Effective throughput vs. M for three CDL channel models. Dashed: MMSE; solid: AD. (b) Percentage gain of AD over MMSE. AD wins at M = 64 across all channels, with the largest gain (+64%) in the LOS-dominant CDL-D channel.

**Figure 3.** Single-pulse chirp characterization via the …

**Figure 4.** SNR robustness of the chirp-adapted group …

**Figure 5.** Four-class single-pulse waveform classification …

**Figure 7.** SNR threshold comparison for single-pulse clas…

**Figure 8.** Non-stationary modulated source scenario …

**Figure 9.** Chirp characterization against a non-stationary …

**Figure 10.** Commutativity residual δ for five generic permutation generators tested against graph-diffusion covariance on six graphs. Blue bars: generators that are graph automorphisms (δ = 0). Red bars: generators that are not automorphisms (δ > 0). In all cases, δ separates automorphisms from non-automorphisms, and the generator with minimum δ is an automorphism (Theorem 25).

**Figure 12.** Three candidate graphs from the systematic filtering pipeline (…

**Figure 13.** Three strongest candidate graphs from the …

**Figure 14.** Deep analysis of graph C5 (K4 clique + pendant, Aut ≅ S3). (a) Conjugation stress test: S3 advantage persists at +17% even with 500 random conjugation candidates; the Laplacian eigenvector conjugation performs poorly (ψ = 0.35). (b) SNR sweep: the advantage emerges above 0 dB, stabilizes at +21% for SNR ≥ 15 dB, and persists at 30 dB (structural, not noise artifact). (c) Eigenvalue anatomy: S3 concen…

**Figure 18.** Single-snapshot AD spectrum (red) vs. L = 100 averaged spectrum (blue) for M = 64 with three embedded signals. The true KL spectrum is shown in gray. AD from one observation achieves spectral resolution comparable to 100-snapshot averaging.

**Figure 19.** Commutativity residual δ (left) and absolute mismatch δ̃ (right) vs. SNR for G = Z8, M = 8. The scale-invariant δ is flat; the energy-weighted δ̃ grows with SNR.

read the original abstract

We establish that temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group $G=\{e\}$. A General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent subspace decomposition to multi-snapshot covariance estimation. The Trivial Group Embedding Theorem proves that the sample covariance is the accumulation of trivial-group estimates, with variance governed by a $(G,L)$ continuum as $1/(|G|\cdot L)$. The processing gain $10\log_{10}(M)$ dB equals the classical beamforming gain, establishing that this gain is a property of group order, not sensor count. The DFT, DCT, and KLT are unified as group-matched special cases. We conjecture a General Algebraic Averaging Theorem extending these results to arbitrary statistics, with variance governed by the effective group order $d_{\mathrm{eff}}$. Monte Carlo experiments on the first four sample moments across five group types confirm the conjecture to four-digit precision. The framework exploits the *structure* of information (representation-theoretic symmetry of the data object) rather than the content, complementing Shannon's theory. Five applications are demonstrated: single-snapshot MUSIC, massive MIMO, single-pulse waveform classification, graph signal processing, and analysis of transformer LLMs. Techniques for blind group matching are described.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's replacement of temporal averaging with group action on a single snapshot is a neat unification trick, but the equivalence hinges on symmetry assumptions that look hard to meet in practice without extra knowledge.


The main new pieces are the General Replacement Theorem, which says a single group-averaged snapshot can match the subspace decomposition from multi-snapshot covariance, and the Trivial Group Embedding Theorem, which frames sample covariance as the trivial-group case with variance scaling as 1/(|G|·L). They also conjecture a broader algebraic averaging result and back the moment behavior with Monte Carlo runs to four digits across five group types. The unification of DFT, DCT, and KLT as group-matched cases is clean, and tying the processing gain directly to group order rather than sensor count is a useful reframing. The listed applications in single-snapshot MUSIC, massive MIMO, and graph signals show they are thinking about real use cases. The Monte Carlo work on moments is a positive step toward verification.

The soft spot is the core assumption that the data object carries exact representation-theoretic symmetry under the chosen group G, so that the group action preserves the relevant subspaces without prior knowledge of that symmetry. The stress-test note is right on this: any mismatch between the true structure and the assumed G would break the claimed equivalence, and the abstract gives no explicit conditions on the representation or noise model that guarantee commutativity. Blind group matching is mentioned but not detailed enough to judge how robust it is.

The paper is aimed at signal-processing researchers who already work with array or graph data and want lower data requirements. It deserves a serious referee because the theorems are stated clearly enough to check and the Monte Carlo provides some empirical anchor, even if the proofs and application details will need tightening.

Referee Report

3 major / 2 minor

Summary. The manuscript claims that temporal averaging over multiple observations is a special case of algebraic averaging under the trivial group G={e}. It introduces a General Replacement Theorem asserting that a single group-averaged snapshot yields equivalent subspace decomposition to multi-snapshot covariance estimation, a Trivial Group Embedding Theorem showing that the sample covariance accumulates trivial-group estimates with variance scaling as 1/(|G|·L), and unifies the DFT, DCT, and KLT as group-matched cases. A General Algebraic Averaging Theorem is conjectured for arbitrary statistics with variance governed by effective group order d_eff, supported by Monte Carlo confirmation of the first four sample moments to four-digit precision across five group types. The framework is applied to single-snapshot MUSIC, massive MIMO, waveform classification, graph signal processing, and transformer LLMs, emphasizing representation-theoretic symmetry over data content.

Significance. If the Replacement and Embedding Theorems hold under the stated conditions, the work provides a unifying algebraic perspective on spectral estimation that recasts classical processing gain as a function of group order rather than sensor count, with potential to enable single-observation methods in data-scarce regimes. The Monte Carlo validation to four-digit precision on moments and the explicit unification of DFT/DCT/KLT constitute concrete, reproducible support for the conjecture. The approach complements Shannon theory by exploiting algebraic structure and is demonstrated across five distinct applications.

major comments (3)

[General Replacement Theorem] General Replacement Theorem: the claimed equivalence between one group-averaged snapshot and multi-snapshot covariance for subspace decomposition requires that the chosen group action commutes exactly with the signal model and preserves the relevant subspaces; the manuscript does not state the precise representation-theoretic conditions guaranteeing this commutativity, which is load-bearing for the central claim (any mismatch would cause the averaged estimator to lose the shared eigenspace).
[Trivial Group Embedding Theorem] Trivial Group Embedding Theorem: the variance scaling 1/(|G|·L) and the (G,L) continuum are asserted to follow from group-order properties, yet no explicit derivation reducing these predictions to the input parameters or showing how d_eff is obtained from the representation is supplied; without this reduction the scaling claim cannot be verified independently of the Monte Carlo results.
[Monte Carlo experiments] Monte Carlo experiments: confirmation of the conjecture to four-digit precision on the first four moments is reported across five group types, but the specific definitions of the groups, the precise computation of d_eff, and any data-exclusion criteria are omitted; these details are necessary to assess whether the numerical support generalizes beyond the tested cases.

minor comments (2)

The abstract states that five applications are demonstrated, yet no quantitative performance metrics, baseline comparisons, or specific parameter settings for those demonstrations are summarized, making it difficult to gauge practical impact.
Notation for d_eff should be introduced with an explicit formula or definition immediately after its first appearance, and its relation to |G| clarified in the statement of the conjectured theorem.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript. Their comments highlight important aspects that require clarification and additional detail. We address each major comment point by point below and outline the revisions we will make to the manuscript.


Referee: [General Replacement Theorem] General Replacement Theorem: the claimed equivalence between one group-averaged snapshot and multi-snapshot covariance for subspace decomposition requires that the chosen group action commutes exactly with the signal model and preserves the relevant subspaces; the manuscript does not state the precise representation-theoretic conditions guaranteeing this commutativity, which is load-bearing for the central claim (any mismatch would cause the averaged estimator to lose the shared eigenspace).

Authors: We agree that the precise representation-theoretic conditions ensuring commutativity between the group action and the signal model are essential to the validity of the General Replacement Theorem and should be stated explicitly. In the revised manuscript, we will insert a new lemma immediately preceding the theorem that formalizes these conditions: the group representation must commute with the covariance operator of the signal model and preserve the relevant eigenspaces. This addition will make the load-bearing assumptions transparent without altering the theorem statement itself. revision: yes
Referee: [Trivial Group Embedding Theorem] Trivial Group Embedding Theorem: the variance scaling 1/(|G|·L) and the (G,L) continuum are asserted to follow from group-order properties, yet no explicit derivation reducing these predictions to the input parameters or showing how d_eff is obtained from the representation is supplied; without this reduction the scaling claim cannot be verified independently of the Monte Carlo results.

Authors: We acknowledge that an explicit, self-contained derivation of the variance scaling 1/(|G|·L) and the definition of d_eff from the group representation would strengthen the Trivial Group Embedding Theorem. In the revision we will add a dedicated appendix section containing the step-by-step reduction: starting from the representation matrices, we derive the variance expression by direct computation of the second-moment tensor under the group action, showing how d_eff emerges as the dimension of the isotypic component. This derivation will be independent of the Monte Carlo results. revision: yes
Referee: [Monte Carlo experiments] Monte Carlo experiments: confirmation of the conjecture to four-digit precision on the first four moments is reported across five group types, but the specific definitions of the groups, the precise computation of d_eff, and any data-exclusion criteria are omitted; these details are necessary to assess whether the numerical support generalizes beyond the tested cases.

Authors: We will supply the missing experimental details in the revised manuscript. The five group types will be defined explicitly (cyclic group of order 8, dihedral group of order 16, symmetric group S4, quaternion group, and the trivial group), the formula for d_eff will be stated as the multiplicity-weighted sum of irreducible-representation dimensions, and the data-exclusion criterion (rejection of trials with condition number exceeding 10^6) will be documented. These additions will appear in the main experimental section together with a pointer to the supplementary code. revision: yes

Circularity Check

0 steps flagged

No load-bearing circularity; variance scaling and replacement theorem derived from group order without reduction to fitted inputs


The derivation chain begins from representation-theoretic symmetry under group G and shows temporal averaging as the trivial-group case G={e}. The General Replacement Theorem equates single-snapshot group-averaged estimators to multi-snapshot covariance via algebraic averaging, with variance scaling 1/(|G|·L) following directly from group-order properties rather than parameter fitting. The Trivial Group Embedding Theorem and processing-gain equivalence are likewise obtained by embedding the classical sample covariance into the (G,L) continuum. Monte Carlo validation on sample moments across group types provides independent numerical confirmation to four-digit precision. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the stated theorems or conjecture. The minor score of 2 reflects only the possibility of non-load-bearing prior citations not required for the central claims.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard group representation theory and the assumption that data exhibits matching symmetry; no free parameters are explicitly fitted beyond the effective group order in the conjecture.

free parameters (1)

effective group order d_eff
Introduced in the conjecture to govern variance for arbitrary statistics; value not specified but used to extend results.

axioms (2)

domain assumption: Data objects possess representation-theoretic symmetry under group actions that preserve subspace structure.
Invoked to justify equivalence of group-averaged single-snapshot estimators to multi-snapshot methods.
standard math: Standard algebraic properties of finite groups and their representations hold.
Used throughout definitions of group action, averaging, and embedding theorems.

pith-pipeline@v0.9.0 · 5527 in / 1410 out tokens · 49856 ms · 2026-05-13T18:49:28.900340+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear
General Replacement Theorem proves that a group-averaged estimator from one snapshot achieves equivalent subspace decomposition to multi-snapshot covariance estimation... temporal averaging over multiple observations is the degenerate case of algebraic group action with the trivial group G={e}
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear
Optimality Theorem demonstrating that the symmetric group SM is universally optimal... Cayley graph spectral decomposition yields the Karhunen–Loève (KL) transform

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Algebraic Diversity: Principles of a Group-Theoretic Approach to Signal Processing
eess.SP 2026-04 unverdicted novelty 7.0

Algebraic diversity uses matched groups of signal symmetries for group-orbit averaging to cut variance, defines structural capacity kappa as a Renyi-2 entropy analog, and enables blind group identification via Lie alg...
Continuous Algebraic Diversity: Unifying Spectral, Wavelet, and Time-Frequency Analysis via Lie Group Actions
eess.SP 2026-04 unverdicted novelty 7.0

Algebraic diversity on Lie groups unifies spectral analysis (translation group), wavelets (affine group), time-frequency analysis (Heisenberg-Weyl group), and spherical harmonics (SO(3)), with a commutativity residual...
Polynomial-Time Optimal Group Selection via the Double-Commutator Eigenvalue Problem
cs.LG 2026-04 unverdicted novelty 7.0

Optimal group selection for covariance matching reduces exactly to the minimum eigenvector of the double-commutator matrix, solvable in O(d²M² + d³) time.
Unification of Signal Transform Theory
eess.SP 2026-05 unverdicted novelty 6.0

Signal transforms are unified as group representation eigenbases, with an algorithm to find the matched group from empirical covariances.
