Deep regression versus detection for counting in robotic phenotyping. IEEE Robotics and Automation Letters, 6(2):2902–2907
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Getting the Numbers Right—Modelling Multi-Class Object Counting in Dense and Varied Scenes
A Twins-SVT vision transformer backbone with multiscale CNN decoder and Category Focus Module auxiliary task reduces MAE by 33-64% on VisDrone and iSAID multi-class counting benchmarks versus prior density estimators.
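
For readers unfamiliar with the metric, the following is a minimal sketch of how mean absolute error (MAE) over per-image counts, and a relative MAE reduction between two models, are commonly computed. The function names and all numbers are hypothetical illustrations chosen for easy arithmetic; they are not taken from the paper or its benchmarks.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(baseline_mae: float, new_mae: float) -> float:
    """Percentage by which new_mae improves on baseline_mae."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return 100.0 * (baseline_mae - new_mae) / baseline_mae


# Hypothetical object counts for four images (illustrative only).
truth = [10, 20, 30, 40]
baseline = [14, 16, 36, 34]  # absolute errors: 4, 4, 6, 6
improved = [12, 18, 33, 37]  # absolute errors: 2, 2, 3, 3

baseline_mae = mean_absolute_error(baseline, truth)
improved_mae = mean_absolute_error(improved, truth)
reduction = relative_reduction(baseline_mae, improved_mae)

print(f"baseline MAE = {baseline_mae}, improved MAE = {improved_mae}")
print(f"relative reduction = {reduction}%")
```

In this toy case the baseline MAE is 5.0 and the improved MAE is 2.5, a 50% reduction. In a multi-class setting the same computation would typically be applied per category; how the paper aggregates across classes is not stated in the summary above.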