Scaling Limits of Long-Context Transformers
Pith reviewed 2026-05-12 01:41 UTC · model grok-4.3
The pith
The critical scaling for attention selectivity depends on the local distance distribution near the query rather than global context features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For i.i.d. uniform keys on the sphere S^{d-1} and fixed query, the attention mechanism undergoes a phase transition at inverse temperature scaling β_n^* ~ n^{2/(d-1)}. Below this scale the output converges to a deterministic local average around the query plus Gaussian fluctuations; exactly at the scale a finite number of nearest keys each receive positive limiting mass; above the scale all mass concentrates on the single closest key. In the subcritical regime with identity value matrix the map approximates the backward heat equation.
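A minimal Monte Carlo sketch of this transition (not code from the paper; the setup below, with keys drawn as normalized Gaussians, query fixed at e_1, and d = 3 so that β_n^* ≍ n, is an illustrative assumption):

import numpy as np

def attention_weights(n, d, beta, rng):
    # n i.i.d. keys uniform on S^{d-1}: normalized standard Gaussian vectors.
    keys = rng.standard_normal((n, d))
    keys /= np.linalg.norm(keys, axis=1, keepdims=True)
    query = np.zeros(d)
    query[0] = 1.0                      # fixed query q = e_1
    scores = beta * (keys @ query)      # softmax logits beta * <q, k_i>
    w = np.exp(scores - scores.max())   # numerically stabilized softmax
    return w / w.sum()

rng = np.random.default_rng(0)
n, d = 200_000, 3
crit = n ** (2 / (d - 1))               # critical scale beta_n^* ~ n^{2/(d-1)} (= n for d = 3)
for label, beta in [("subcritical", 0.01 * crit),
                    ("critical", 1.0 * crit),
                    ("supercritical", 100 * crit)]:
    w = attention_weights(n, d, beta, rng)
    top5 = np.sort(w)[-5:][::-1]
    print(f"{label:13s} beta={beta:.3g}  top-5 weights = {np.round(top5, 3)}")

The three runs should show, respectively, weights spread over many nearby keys, a handful of O(1) weights, and near-total collapse onto the closest key, in line with the three regimes described above.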
What carries the argument
The local exponent of the distance-to-query distribution near zero, which fixes the critical scaling β_n^* ≍ n^{2/(d-1)} and determines the limiting laws of ordered attention weights and outputs across all regimes.
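A sketch of the heuristic behind this exponent (the paper's proof proceeds via extreme-value analysis; the steps below only record the scaling): for K uniform on S^{d-1} and fixed q, writing 1 − ⟨q, K⟩ = 1 − cos θ ≈ θ²/2, the spherical-cap volume gives P(1 − ⟨q, K⟩ ≤ t) ≍ θ^{d-1} ≍ t^{(d-1)/2} as t → 0. Among n i.i.d. keys the smallest gap to the query is therefore of order t_min ≍ n^{−2/(d−1)}, and the softmax becomes selective once β_n t_min ≍ 1, i.e. at β_n^* ≍ n^{2/(d−1)}.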
If this is right
- Below the critical scale the attention output is a local average around the query with deterministic bias and Gaussian fluctuations.
- At the critical scale a finite collection of nearest keys retains macroscopic mass without collapse to a single key.
- Above the critical scale all attention mass concentrates on the single closest key.
- In the subcritical regime with the identity value matrix the attention map approximates the backward heat equation.
Where Pith is reading between the lines
- If real token embeddings deviate from uniform sphere distribution, the critical scaling for selectivity would shift according to the new local distance exponent.
- The regime analysis supplies a concrete way to choose β_n in practice so that long-context attention achieves a chosen balance between averaging and focus.
- The same local-exponent approach could be applied to non-uniform or dependent key distributions that better model actual embedding spaces.
Load-bearing premise
The keys are modeled as independent uniform random points on the sphere with a fixed query, producing a specific power-law tail for small distances.
What would settle it
A numerical computation of the attention output for large n at β_n = n^{2/(d-1) − 0.1}, showing that it fails to converge to the predicted local average with the stated deterministic bias, would falsify the subcritical regime description.
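A minimal sketch of such a check (illustrative, not the paper's protocol): it tracks the largest attention weight and the alignment of the output with q at β_n = n^{2/(d−1) − 0.1}, assuming uniform keys, identity value matrix, and d = 3. At this exponent the effective number of contributing keys grows only roughly like n^{0.1} for d = 3, so the trend is slow.

import numpy as np

def subcritical_check(d=3, shift=0.1, sizes=(10**3, 10**4, 10**5, 10**6), seed=0):
    # Diagnostic at beta_n = n^{2/(d-1) - shift}: in the subcritical regime the
    # largest weight should drift toward 0 and the output should align with q.
    rng = np.random.default_rng(seed)
    q = np.zeros(d)
    q[0] = 1.0
    for n in sizes:
        beta = n ** (2.0 / (d - 1) - shift)
        keys = rng.standard_normal((n, d))
        keys /= np.linalg.norm(keys, axis=1, keepdims=True)
        scores = beta * (keys @ q)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        out = w @ keys                  # identity value matrix: output = weighted average of keys
        align = out[0] / np.linalg.norm(out)
        print(f"n={n:>8d}  beta={beta:10.1f}  max weight={w.max():.3f}  <q, out/|out|>={align:.4f}")

subcritical_check()

A persistent failure of the maximum weight to decrease, or of the output to align with q, as n grows would be evidence against the subcritical description; given the slow n^{0.1} growth for d = 3, a sharp verdict needs very large n or larger d.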
Original abstract
We study the long-context limit of softmax self-attention with a fixed query and a random context of $n$ i.i.d. keys on the sphere, viewing the inverse temperature $\beta_n$ as the scaling parameter that decides whether attention degenerates into uniform averaging or collapses onto the single closest key. We show that the critical scale at which selectivity emerges is determined by the local exponent of the distance-to-query distribution near zero rather than by global features of the context, and scales like $\beta_n^\ast \asymp n^{2/(d-1)}$ for uniform keys on $\mathbb{S}^{d-1}$. Furthermore, we characterize the limiting laws of the ordered attention weights and of the attention output across all regimes of $\beta_n$: a subcritical regime in which the output reduces to a local average around $q$ with explicit deterministic bias and Gaussian fluctuations; a critical regime in which a finite collection of nearest keys retains macroscopic mass without single-key collapse; and a supercritical regime in which all mass concentrates on the closest key. Of notable interest is the subcritical case with identity value matrix where the attention map approximately implements a backward heat equation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes the long-context scaling limits of softmax self-attention with a fixed query and n i.i.d. uniform keys on the sphere S^{d-1}. Treating the inverse temperature β_n as the scaling parameter, it shows that the critical scale at which attention becomes selective (rather than uniform averaging) is β_n^* ≍ n^{2/(d-1)}, governed by the local power-law exponent of the distance-to-query distribution near zero. The paper characterizes the limiting laws of the ordered attention weights and attention output in three regimes: subcritical (local averaging around the query with explicit deterministic bias and Gaussian fluctuations), critical (macroscopic mass retained by a finite number of nearest keys), and supercritical (collapse onto the single closest key). It further notes that the subcritical regime with identity value matrix approximates a backward heat equation.
Significance. If the results hold, the work supplies a rigorous probabilistic framework for phase transitions in attention, demonstrating that selectivity thresholds depend on local geometry of the key distribution rather than global context statistics. The explicit limiting distributions derived via order statistics and extreme-value tools, together with the heat-equation connection, constitute a clear theoretical contribution that could guide analysis of long-context transformers. The parameter-free nature of the critical scaling and the regime-specific characterizations are notable strengths.
major comments (2)
- [Main results / critical-scale theorem] The central derivation of β_n^* ≍ n^{2/(d-1)} (stated in the abstract and main theorem) rests on the local exponent of the distance distribution; the manuscript should explicitly derive or cite the spherical-cap volume calculation that produces the factor 2/(d-1) to confirm the exponent is load-bearing and not an artifact of the uniform assumption.
- [Subcritical regime analysis] In the subcritical regime, the claim that the attention output reduces to a local average with deterministic bias and Gaussian fluctuations is load-bearing for the heat-equation interpretation; the error bounds or convergence rates (especially for large d) should be stated explicitly so that the approximation's validity for finite n is clear.
minor comments (3)
- [Notation and setup] Clarify the precise definition of the ordered attention weights (e.g., whether ties are broken randomly or by index) at the first appearance of the notation.
- [Abstract and introduction] The abstract mentions 'all regimes of β_n'; add a short table or diagram summarizing the three regimes, their β_n scalings, and the corresponding limiting behaviors for quick reference.
- [Discussion of value matrix] Include a brief remark on how the results extend (or fail to extend) when the value matrix is not the identity, as this affects the heat-equation claim.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and constructive comments. We address each major comment below and will incorporate clarifications in the revised manuscript.
Point-by-point responses
Referee: [Main results / critical-scale theorem] The central derivation of β_n^* ≍ n^{2/(d-1)} (stated in the abstract and main theorem) rests on the local exponent of the distance distribution; the manuscript should explicitly derive or cite the spherical-cap volume calculation that produces the factor 2/(d-1) to confirm the exponent is load-bearing and not an artifact of the uniform assumption.
Authors: We agree that an explicit derivation strengthens the presentation. In the revision we will add a short subsection deriving the local volume scaling: for uniform K on S^{d-1}, the surface measure yields P(1 - ⟨q, K⟩ ≤ t) ∼ c_d t^{(d-1)/2} as t → 0 (via the standard parametrization of spherical caps and the quadratic approximation cos θ ≈ 1 - θ²/2). This directly produces the critical scaling β_n^* ≍ n^{2/(d-1)} through the extreme-value analysis of the maximum inner product and is a consequence of local Euclidean geometry rather than a global artifact of uniformity. We will also cite the relevant spherical-geometry references.
Revision: yes
Referee: [Subcritical regime analysis] In the subcritical regime, the claim that the attention output reduces to a local average with deterministic bias and Gaussian fluctuations is load-bearing for the heat-equation interpretation; the error bounds or convergence rates (especially for large d) should be stated explicitly so that the approximation's validity for finite n is clear.
Authors: The subcritical theorems establish convergence in distribution to the stated local-average limit (with explicit bias and Gaussian fluctuations) as n → ∞ under β_n = o(n^{2/(d-1)}). In the revision we will add a remark on convergence rates, noting that the contribution of distant keys decays exponentially in the subcritical regime and that the Gaussian approximation error can be controlled via standard Berry–Esseen bounds on the order statistics of the inner products. The bias and variance constants depend on d through the cap-volume prefactor, which we will make explicit; the results hold for fixed d with n → ∞, and we will clarify the finite-n regime of validity.
Revision: yes
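To illustrate the Berry–Esseen point above (a rough editorial Monte Carlo diagnostic, not part of the authors' response or their proposed bound): the sketch below standardizes a query-orthogonal coordinate of the subcritical attention output across independent contexts and reports its skewness and excess kurtosis, which should be near zero if the Gaussian-fluctuation claim is accurate at that n. The parameters n, d, and the exponent 0.6 are illustrative choices.

import numpy as np

def fluctuation_moments(n=20_000, d=3, exponent=0.6, reps=400, seed=1):
    # beta_n = n^exponent with exponent < 2/(d-1) (= 1 for d = 3), i.e. a subcritical run.
    rng = np.random.default_rng(seed)
    q = np.zeros(d)
    q[0] = 1.0
    beta = n ** exponent
    samples = np.empty(reps)
    for r in range(reps):
        keys = rng.standard_normal((n, d))
        keys /= np.linalg.norm(keys, axis=1, keepdims=True)
        scores = beta * (keys @ q)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        samples[r] = (w @ keys)[1]      # coordinate orthogonal to q; mean 0 by symmetry
    z = (samples - samples.mean()) / samples.std()
    skew = np.mean(z ** 3)
    kurt = np.mean(z ** 4) - 3.0
    print(f"skewness = {skew:+.3f}, excess kurtosis = {kurt:+.3f} (near 0 if fluctuations are ~Gaussian)")

fluctuation_moments()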
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper derives scaling limits and regime transitions for softmax attention directly from the explicit model of i.i.d. uniform keys on S^{d-1} with fixed query. The critical exponent β_n^* ≍ n^{2/(d-1)} follows from the local volume scaling of spherical caps (distance distribution near zero has power d-1), a standard geometric fact applied via extreme-value statistics. Limiting laws for ordered weights and attention output in sub-, critical, and super-critical regimes are obtained from tail asymptotics and order statistics without fitted parameters, self-definitions, or load-bearing self-citations. All steps remain within the stated probabilistic assumptions and use classical tools, rendering the chain self-contained.
Axiom & Free-Parameter Ledger
free parameters (2)
- β_n
- d
axioms (2)
- domain assumption: Keys are i.i.d. uniform on the unit sphere S^{d-1}
- standard math: Standard results from extreme-value theory and local limit laws for distances on the sphere