Attention pooling produces a free-multiplicative-convolution bulk spectrum and two phase transitions for signal recovery; optimal weights are the top eigenvector of the positional correlation matrix R.
Step 0: Pooled representation and bulk reduction. Recall that each pooled row of C can be written as $c_n = \sum_{t=1}^{T} w_t X^{(n)}_t = \sum_{t=1}^{T} w_t \xi_{n,t}\,\mu + \sum_{t=1}^{T} w_t z^{x}_{n,t}$
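The decomposition in Step 0 follows from linearity of pooling, and a small numerical sketch can make that concrete. The NumPy code below is illustrative only and is not taken from the paper: the function names `pool_rows` and `top_eigvec`, the Gaussian draws, the dimensions, and the uniform weights are all assumptions. In particular, the empirical matrix `R_hat` assumes that the positional correlation matrix R has entries E[xi_t xi_s], a definition the excerpt does not state.

```python
import numpy as np


def pool_rows(xi, mu, z, w):
    """Pool token rows with weights w and split the result into signal and noise.

    xi: (N, T) scalar coefficients xi_{n,t}
    mu: (d,)   signal direction
    z:  (N, T, d) noise vectors (written z^x_{n,t} in the text)
    w:  (T,)   pooling weights w_t
    """
    # Token representations X_t^(n) = xi_{n,t} * mu + z_{n,t}, shape (N, T, d).
    X = xi[:, :, None] * mu[None, None, :] + z
    # Pooled rows c_n = sum_t w_t X_t^(n), shape (N, d).
    c = np.einsum("t,ntd->nd", w, X)
    # Signal part: (sum_t w_t xi_{n,t}) * mu.
    signal = np.outer(xi @ w, mu)
    # Noise part: sum_t w_t z_{n,t}.
    noise = np.einsum("t,ntd->nd", w, z)
    return c, signal, noise


def top_eigvec(R):
    """Unit eigenvector for the largest eigenvalue of a symmetric matrix R."""
    vals, vecs = np.linalg.eigh(R)  # eigenvalues are returned in ascending order
    v = vecs[:, -1]
    # Fix the sign so the result is reproducible.
    return v if v.sum() >= 0 else -v


# Hypothetical sizes and data, chosen only for illustration.
rng = np.random.default_rng(0)
N, T, d = 200, 8, 16
xi = rng.standard_normal((N, T))
mu = rng.standard_normal(d)
z = rng.standard_normal((N, T, d))

w_uniform = np.full(T, 1.0 / T)  # mean pooling as a baseline choice of weights
c, signal, noise = pool_rows(xi, mu, z, w_uniform)

# Assumed definition: R_{ts} = E[xi_t xi_s], estimated from the samples.
R_hat = xi.T @ xi / N
w_top = top_eigvec(R_hat)
```

Here `c` equals `signal + noise` up to floating-point error, which is the identity written in Step 0. The vector `w_top` only illustrates how weights of the form "top eigenvector of R" could be computed under the assumed definition of R; it makes no claim about the paper's normalization or optimality criterion.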
Representative citing paper (field: stat.ML; year: 2026; verdict: CONDITIONAL):
How Does Attention Help? Insights from Random Matrices on Signal Recovery from Sequence Models