org/P19-1472

Zhang, M · 2024 · arXiv 2402.04347

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

Functional Attention: From Pairwise Affinities to Functional Correspondences

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

Functional Attention replaces pairwise softmax attention with structured linear operators inspired by geometric functional maps to produce compact, resolution-invariant representations for operator learning.

Elastic Attention Cores for Scalable Vision Transformers

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.

Attention to Mamba: A Recipe for Cross-Architecture Distillation

cs.CL · 2026-04-01 · unverdicted · novelty 6.0

A two-stage distillation recipe converts a Pythia-1B Transformer into a Mamba model that preserves performance with perplexity 14.11 versus the teacher's 13.86.

UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution

cs.CV · 2026-03-12 · unverdicted · novelty 6.0

UCAN unifies window-based spatial attention and Hedgehog Attention with a distillation-based large-kernel module and cross-layer sharing to deliver competitive PSNR at low MACs in lightweight super-resolution.

Lizard: An Efficient Linearization Framework for Large Language Models

cs.CL · 2025-07-11 · unverdicted · novelty 6.0

Lizard linearizes Transformer LLMs via subquadratic attention and adaptive learnable modules, recovering near-original performance while outperforming prior linearization methods on MMLU and associative recall.

citing papers explorer

Showing 5 of 5 citing papers.

Functional Attention: From Pairwise Affinities to Functional Correspondences cs.LG · 2026-05-29 · unverdicted · none · ref 16
Functional Attention replaces pairwise softmax attention with structured linear operators inspired by geometric functional maps to produce compact, resolution-invariant representations for operator learning.
Elastic Attention Cores for Scalable Vision Transformers cs.CV · 2026-05-12 · unverdicted · none · ref 55
VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.
Attention to Mamba: A Recipe for Cross-Architecture Distillation cs.CL · 2026-04-01 · unverdicted · none · ref 36
A two-stage distillation recipe converts a Pythia-1B Transformer into a Mamba model that preserves performance with perplexity 14.11 versus the teacher's 13.86.
UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution cs.CV · 2026-03-12 · unverdicted · none · ref 41
UCAN unifies window-based spatial attention and Hedgehog Attention with a distillation-based large-kernel module and cross-layer sharing to deliver competitive PSNR at low MACs in lightweight super-resolution.
Lizard: An Efficient Linearization Framework for Large Language Models cs.CL · 2025-07-11 · unverdicted · none · ref 23
Lizard linearizes Transformer LLMs via subquadratic attention and adaptive learnable modules, recovering near-original performance while outperforming prior linearization methods on MMLU and associative recall.

org/P19-1472

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer