Inductive bias and spectral properties of single-head attention in high dimensions. arXiv preprint arXiv:2509.24914, 2025.
2 Pith papers cite this work, both from 2026.
citing papers explorer
- Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
Neural LoFi models deep learning as layer-wise spectral filtering that selects maximal low-degree correlations, yielding a tractable surrogate for hierarchical representation learning beyond the lazy regime.
- How Does Attention Help? Insights from Random Matrices on Signal Recovery from Sequence Models
Attention pooling produces a free-multiplicative-convolution bulk spectrum and two phase transitions for signal recovery; optimal weights are the top eigenvector of the positional correlation matrix R.