Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

2019 · cs.CV · arXiv 1902.06162

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Large-scale labeled data are generally required to train deep neural networks in order to obtain better performance in visual feature learning from images or videos for computer vision applications. To avoid the extensive cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without using any human-annotated labels. This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images or videos. First, the motivation, general pipeline, and terminologies of this field are described. Then the common deep neural network architectures that are used for self-supervised learning are summarized. Next, the main components and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used image and video datasets and the existing self-supervised visual feature learning methods. Then, quantitative performance comparisons of the reviewed methods on benchmark datasets are summarized and discussed for both image and video feature learning. Finally, the paper concludes with a set of promising future directions for self-supervised visual feature learning.

representative citing papers

Machine Intelligence that Understands Visual and Linguistic Information and Interacts with Humans and Environments

cs.CV · 2026-05-20 · unverdicted · novelty 4.0

Introduces GRIT, LTMI, and a hierarchical attention framework claiming performance gains on image captioning, visual dialog, and ALFRED instruction following.

Accurate and Robust Pulmonary Nodule Detection by 3D Feature Pyramid Network with Self-supervised Feature Learning

eess.IV · 2019-07-25 · unverdicted · novelty 4.0

A 3DFPN with self-supervised pretraining and HS2 false-positive reduction using location history images reaches 90.6% sensitivity at 0.125 FP/scan on LUNA16, claimed 15.8% above prior results.

Machine Learning Approaches for Improved Scalability of Metallic Magnetic Calorimeters

physics.ins-det · 2026-06-23 · unverdicted · novelty 3.0

Machine learning methods are explored for pulse classification, artifact rejection, and shape analysis in metallic magnetic calorimeters to improve scalability over traditional signal processing.
