
arxiv: 1802.02899 · v3 · pith:6TQHAWPEnew · submitted 2018-02-07 · 💻 cs.CV

From Selective Deep Convolutional Features to Compact Binary Representations for Image Retrieval

classification 💻 cs.CV
keywords image · retrieval · representations · convolutional · features · propose · binary · compact

In large-scale image retrieval, the two most important requirements are the discriminability of image representations and the efficiency with which those representations are computed and stored. Regarding the former, Convolutional Neural Networks (CNNs) have proven to be a very powerful tool for extracting highly discriminative local descriptors for effective image search. To further improve the discriminative power of the descriptors, recent works adopt fine-tuning strategies. In this paper, taking a different approach, we propose a novel, computationally efficient, and competitive framework. First, we propose several strategies for computing masks, namely SIFT-mask, SUM-mask, and MAX-mask, which select a representative subset of local convolutional features and eliminate redundant ones. Our in-depth analyses demonstrate that the proposed masking schemes are effective in addressing the burstiness drawback and in improving retrieval accuracy. Second, we propose to employ recent embedding and aggregating methods, which can significantly boost feature discriminability. Regarding computation and storage efficiency, we include a hashing module that produces very compact binary image representations. Extensive experiments on six image retrieval benchmarks demonstrate that the proposed framework achieves state-of-the-art retrieval performance.
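The abstract names the masking schemes and the hashing module but does not define them, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes a feature map stored as `fmap[c][y][x]` with non-negative activations, a SUM-mask that keeps locations whose channel-wise sum exceeds the median sum, and a MAX-mask that keeps every location holding the maximum of at least one channel. The sign-threshold `binarize` function is a hypothetical stand-in for a learned hashing module, and all function names are illustrative.

```python
from statistics import median


def sum_mask(fmap):
    """Keep locations whose channel-wise activation sum exceeds the median sum.

    fmap[c][y][x] holds C feature maps of size H x W.
    The median threshold is an assumption made for this sketch.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    sums = [[sum(ch[y][x] for ch in fmap) for x in range(w)] for y in range(h)]
    thr = median(v for row in sums for v in row)
    return [[sums[y][x] > thr for x in range(w)] for y in range(h)]


def max_mask(fmap):
    """Keep each location that holds the maximum activation of at least one channel."""
    h, w = len(fmap[0]), len(fmap[0][0])
    mask = [[False] * w for _ in range(h)]
    for ch in fmap:
        positions = ((y, x) for y in range(h) for x in range(w))
        y, x = max(positions, key=lambda p: ch[p[0]][p[1]])
        mask[y][x] = True
    return mask


def select_features(fmap, mask):
    """Return the C-dimensional local descriptors at the masked locations."""
    return [
        [ch[y][x] for ch in fmap]
        for y, row in enumerate(mask)
        for x, keep in enumerate(row)
        if keep
    ]


def binarize(vec):
    """Sign-threshold a real-valued vector into bits (stand-in for a learned hash)."""
    return [1 if v > 0 else 0 for v in vec]


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy example: 2 channels over a 2 x 2 spatial grid.
fmap = [
    [[0.0, 5.0], [1.0, 0.0]],
    [[0.0, 1.0], [4.0, 0.5]],
]
selected = select_features(fmap, sum_mask(fmap))
```

In this toy input both masks keep the same two locations, which illustrates the intent of discarding weakly activated positions before embedding and aggregation. Comparing binary codes with `hamming` is what makes the compact representation cheap to search.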



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. BTEL: A Binary Tree Encoding Approach for Visual Localization

    cs.CV · 2019-06 · unverdicted · novelty 6.0

    BTEL uses a binary tree structure to encode visual features for localization, delivering sub-linear storage and query time while remaining agnostic to front-end descriptors.