Attention mechanisms trained on Gaussian data recover principal eigenvectors of the covariance matrix in finite and infinite prompt regimes.
Shivam Garg, Dimitris Tsipras, Percy Liang, and Gregory Valiant
5 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 5
roles: background 1
polarities: background 1
representative citing papers
An iSVD-based adaptive ROM framework updates reduced bases with occasional full-order snapshots, showing improved accuracy and efficiency over direct adaptation baselines on Burgers, Sod, and rotating detonation engine problems.
Direction maps and pinwheel structures in MT emerge spontaneously when a spatiotemporal deep network is trained on videos with contrastive self-supervised learning and spatial regularization.
OjaKV introduces hybrid full-rank storage for key tokens combined with online low-rank KV cache compression via Oja's algorithm to support memory-efficient long-context LLM inference.
MPCS integrates eleven plasticity mechanisms and reaches a Normalized Efficiency Score of 94.2 on a 31-task benchmark, with ablations showing that removing EWC and Hebbian updates yields higher performance at lower cost.
citing papers explorer
- Self-organized MT Direction Maps Emerge from Spatiotemporal Contrastive Optimization