VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.
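A minimal sketch, assuming a PyTorch-style module, of how this routing stays linear: a small fixed set of learned core tokens gathers from all N patches, then the patches read the updated core back, so patches never attend to one another and the per-layer cost is O(N × num_core) rather than O(N²). The class name, the two cross-attention steps, and all dimensions below are illustrative assumptions, not VECA's actual implementation.

```python
import torch
import torch.nn as nn


class CorePeripheryAttention(nn.Module):
    """Illustrative core-periphery attention block (not VECA's code)."""

    def __init__(self, dim: int, num_core: int = 16, num_heads: int = 4):
        super().__init__()
        # Resolution-invariant set of learned core embeddings.
        self.core = nn.Parameter(torch.randn(1, num_core, dim) * 0.02)
        # Core tokens gather information from all N patches: O(N * num_core).
        self.gather = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Patches read the updated core back: again O(N * num_core).
        self.scatter = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (B, N, dim); N can vary with input resolution,
        # while the core size stays fixed.
        b = patches.shape[0]
        core = self.core.expand(b, -1, -1)
        # Step 1: core attends to patches (queries = core, keys/values = patches).
        core, _ = self.gather(core, patches, patches)
        # Step 2: patches attend only to the updated core, never to each other,
        # so the cost is linear in N rather than quadratic.
        out, _ = self.scatter(patches, core, core)
        return out


# Usage: 196 patches at one resolution or 784 at another, same core size.
layer = CorePeripheryAttention(dim=256, num_core=16)
x = torch.randn(2, 196, 256)
print(layer(x).shape)  # torch.Size([2, 196, 256])
```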
3 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 3
Citation-role summary: background (1)
Citation-polarity summary: unclear (1)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)

Representative citing papers
A two-stage distillation recipe converts a Pythia-1B Transformer into a Mamba model that largely preserves performance, reaching perplexity 14.11 versus the teacher's 13.86 (a generic distillation-loss sketch follows these summaries).
Randomly initialized Transformers act as adaptive sequence smoothers for sleep staging via a Random Attention Prior Kernel, with gains mainly from inductive bias rather than training.
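As a rough illustration of what logit-level distillation between architectures involves, the sketch below pairs a frozen teacher's logits with a student's. The KL-plus-cross-entropy form, temperature, and mixing weight are generic assumptions; the cited paper's actual two-stage recipe is not reproduced here.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # student_logits, teacher_logits: (B, T, vocab); labels: (B, T).
    # Soft targets: KL divergence between temperature-scaled distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.log_softmax(teacher_logits / temperature, dim=-1),
        log_target=True,
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary next-token cross-entropy on the student.
    ce = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)),
        labels.reshape(-1),
    )
    return alpha * kd + (1.0 - alpha) * ce
```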
Citing papers explorer
Attention to Mamba: A Recipe for Cross-Architecture Distillation