pith. machine review for the scientific record.

Going deeper with image transformers

2 Pith papers cite this work. Polarity classification is still indexing.


fields: cs.CV (1) · cs.LG (1)

years: 2026 (1) · 2021 (1)

representative citing papers

TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

TFM-Retouche is an architecture-agnostic input-space residual adapter that improves tabular foundation model accuracy on 51 datasets by learning input corrections through the frozen backbone, with an identity guard to fall back to the original model.
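The summary above describes an input-space residual adapter: a small learned correction is added to the input before it passes through the unchanged frozen backbone, and an identity guard lets the system fall back to the original model exactly. A minimal sketch of that idea in numpy (all names and the toy backbone are illustrative assumptions, not the paper's actual API):

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_backbone(x):
    # Stand-in for a frozen tabular foundation model: a fixed linear scorer.
    w = np.array([0.5, -1.0, 0.25])
    return x @ w

class ResidualInputAdapter:
    """Hypothetical input-space residual adapter with an identity guard."""

    def __init__(self, dim, hidden=8):
        self.W1 = rng.normal(scale=0.1, size=(dim, hidden))
        self.W2 = rng.normal(scale=0.1, size=(hidden, dim))
        self.gate = 1.0  # identity guard: set to 0.0 to recover the original model

    def correction(self, x):
        # Small learned input correction (a two-layer MLP here, as an assumption).
        return np.tanh(x @ self.W1) @ self.W2

    def __call__(self, x):
        # The corrected input is fed through the *unchanged* frozen backbone.
        return frozen_backbone(x + self.gate * self.correction(x))

x = rng.normal(size=(4, 3))
adapter = ResidualInputAdapter(dim=3)

adapted = adapter(x)
adapter.gate = 0.0            # identity guard engaged
fallback = adapter(x)         # identical to the original frozen model's output
```

Because the correction enters only at the input, the backbone stays architecture-agnostic: zeroing the gate reproduces the frozen model's predictions bit-for-bit.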

BEiT: BERT Pre-Training of Image Transformers

cs.CV · 2021-06-15 · conditional · novelty 7.0

BEiT pre-trains vision transformers via masked image modeling on visual tokens and reaches 83.2% ImageNet top-1 accuracy for the base model and 86.3% for the large model using only ImageNet-1K data.
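The masked-image-modeling objective named above can be sketched as follows: patches get discrete visual-token targets from a tokenizer, a random subset of patches is masked, and the loss is cross-entropy over the vocabulary at the masked positions only. A toy numpy version, assuming random stand-ins for the tokenizer and the model's logits (not BEiT's actual components):

```python
import numpy as np

rng = np.random.default_rng(1)

num_patches, vocab_size = 16, 32
# Targets: one discrete visual-token id per patch, as a tokenizer would assign.
visual_tokens = rng.integers(0, vocab_size, size=num_patches)

# Mask a random subset of patch positions.
mask_ratio = 0.4
masked = rng.permutation(num_patches)[: int(mask_ratio * num_patches)]

# Stand-in predictor: per-patch logits over the visual-token vocabulary.
logits = rng.normal(size=(num_patches, vocab_size))

def mim_loss(logits, targets, masked_positions):
    # Cross-entropy computed only at masked positions, as in masked image modeling.
    z = logits[masked_positions]
    z = z - z.max(axis=1, keepdims=True)                      # stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(masked_positions)),
                      targets[masked_positions]].mean()

loss = mim_loss(logits, visual_tokens, masked)
```

Scoring only the masked positions is what makes the task self-supervised: the model must infer the missing patches' token ids from visible context alone.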

citing papers explorer

Showing 2 of 2 citing papers.

  • TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models · cs.LG · 2026-05-07 · unverdicted · polarity none · ref 30

  • BEiT: BERT Pre-Training of Image Transformers · cs.CV · 2021-06-15 · conditional · polarity none · ref 17