pith. machine review for the scientific record.

Token pooling in vision transformers

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.AI 1 · cs.CV 1

years

2026 1 · 2022 1

verdicts

UNVERDICTED 2

representative citing papers

Token Merging: Your ViT But Faster

cs.CV · 2022-10-17 · unverdicted · novelty 6.0

Token Merging (ToMe) doubles the throughput of large Vision Transformers on images, video, and audio by merging similar tokens with a fast matching algorithm, incurring only 0.2-0.4% accuracy loss.
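The "fast matching algorithm" in the summary above refers to ToMe's bipartite soft matching: tokens are split into two alternating sets, each token in the first set is paired with its most similar token in the second, and the r most similar pairs are merged by averaging. The sketch below is a minimal, simplified illustration of that idea, not the paper's implementation; the function name and the plain (unweighted) mean merge are assumptions for clarity.

```python
import numpy as np

def merge_similar_tokens(tokens, r):
    """Sketch of ToMe-style bipartite soft matching (simplified assumption:
    unweighted mean merge; the paper tracks token sizes for a weighted mean).

    tokens: (N, D) array of token embeddings; r: number of pairs to merge.
    Returns an (N - r, D) array of merged tokens.
    """
    # Split tokens into two alternating sets A and B.
    a, b = tokens[::2], tokens[1::2]
    # Cosine similarity between every token in A and every token in B.
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = a_n @ b_n.T
    # For each A-token, find its single best match in B.
    best = sim.argmax(axis=1)
    best_sim = sim.max(axis=1)
    # Merge only the r most similar pairs; the rest of A survives unmerged.
    merged_idx = np.argsort(-best_sim)[:r]
    keep_idx = np.setdiff1d(np.arange(len(a)), merged_idx)
    out_b = b.copy()
    for i in merged_idx:
        # Fold the A-token into its matched B-token by averaging.
        out_b[best[i]] = (out_b[best[i]] + a[i]) / 2
    return np.concatenate([a[keep_idx], out_b], axis=0)
```

Because each A-token matches at most one B-token and only r pairs are merged, the token count drops by exactly r per layer, which is what yields the throughput gains the summary describes.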

citing papers explorer

Showing 2 of 2 citing papers.

  • Why Training-Free Token Reduction Collapses: The Inherent Instability of Pairwise Scoring Signals cs.AI · 2026-04-17 · unverdicted · none · ref 34

    Pairwise scoring signals in Vision Transformer token reduction are inherently unstable, owing to high perturbation counts, and degrade in deep layers, causing collapse. Unary signals with triage instead enable CATIS to retain 96.9% accuracy at a 63% FLOPs reduction on ViT-Large ImageNet-1K.

  • Token Merging: Your ViT But Faster cs.CV · 2022-10-17 · unverdicted · none · ref 8

    Token Merging (ToMe) doubles the throughput of large Vision Transformers on images, video, and audio by merging similar tokens with a fast matching algorithm, incurring only 0.2-0.4% accuracy loss.