Grounding DINO fuses language and vision via a feature enhancer, language-guided query selection, and a cross-modality decoder in a DINO backbone, achieving 52.5 AP zero-shot on COCO and a new record mean AP of 26.1 on ODinW.
IEEE Transactions on Pattern Analysis and Machine Intelligence 39(6), 1137–1149 (2017)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2023 (1)
Verdicts: ACCEPT (1)
Representative citing papers
- Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection