An empirical benchmark finds that attention-based models (SwinTiny, CoAtNet0, MaxViTTiny) achieve the highest AUC, above 84%, on RFMiD binary screening and the best F1 scores on the multi-label task. VLMs are competitive but not superior, and external AUC on Messidor-2 ranges from 66.8% to 84.7%.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still being indexed.
Pith papers citing it: 1
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Benchmarking Convolutional, Transformer, Hybrid, and Vision Language Models for Multi Disease Retinal Screening