Attention mechanisms trained on Gaussian data learn parameters aligned with the principal eigenvectors of the covariance matrix, establishing an explicit link to PCA in both finite and infinite prompt regimes.
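As a point of reference for the PCA side of this claim, the following minimal sketch computes the principal eigenvector of a sample covariance matrix for synthetic Gaussian data. It does not reproduce the attention training described in the paper. The function name `principal_eigenvectors`, the diagonal covariance, and the sample size are illustrative assumptions rather than anything taken from the source.

```python
import numpy as np


def principal_eigenvectors(samples, k=1):
    """Return the top-k eigenvalues and eigenvectors of the sample covariance.

    Hypothetical helper for illustration only; rows of `samples` are observations.
    """
    cov = np.cov(samples, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]


# Assumed setup: zero-mean Gaussian data whose leading direction is the first axis.
rng = np.random.default_rng(0)
true_cov = np.diag([5.0, 1.0, 0.5])
samples = rng.multivariate_normal(np.zeros(3), true_cov, size=20000)

vals, vecs = principal_eigenvectors(samples, k=1)

# Alignment is measured up to sign, since eigenvectors are only defined up to sign.
alignment = abs(vecs[:, 0] @ np.array([1.0, 0.0, 0.0]))
print("leading eigenvalue:", vals[0])
print("alignment with first axis:", alignment)
```

Under these assumptions, the recovered direction is the one that a learned parameter would be compared against when checking alignment with the principal eigenvectors.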
Since $R^{\mathrm{ICL}}_{\infty}$ is coercive, these two points are the global minimizers of the function
Attention-based PCA