Object Detectors Emerge in Deep Scene CNNs

Aditya Khosla; Agata Lapedriza; Antonio Torralba; Aude Oliva; Bolei Zhou

arxiv: 1412.6856 · v2 · pith:DJ6QEGSCnew · submitted 2014-12-22 · 💻 cs.CV · cs.NE

Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

keywords: scene, detectors, object, objects, architectures, classification, cnns, deep

With the success of new computational architectures for visual processing, such as convolutional neural networks (CNNs), and access to image databases with millions of labeled examples (e.g., ImageNet, Places), the state of the art in computer vision is advancing rapidly. One important factor for continued progress is to understand the representations that are learned by the inner layers of these deep architectures. Here we show that object detectors emerge from training CNNs to perform scene classification. As scenes are composed of objects, the CNN for scene classification automatically discovers meaningful object detectors, representative of the learned scene categories. With object detectors emerging as a result of learning to recognize scenes, our work demonstrates that the same network can perform both scene recognition and object localization in a single forward pass, without ever having been explicitly taught the notion of objects.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Toy Models of Superposition
cs.LG 2022-09 accept novelty 8.0

Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulne...

Explainable Part-Based Vehicle Classifier with Spatial Awareness
cs.CV 2026-05 unverdicted novelty 4.0

A part-based vehicle classifier using spatial probability maps for parts and softmax regression achieves accuracy comparable to end-to-end CNNs with greater robustness and explainability.

Naturalistic Computational Cognitive Science: Towards generalizable models and theories that capture the full range of natural behavior
q-bio.NC 2025-02 unverdicted novelty 4.0

Advocates integrating naturalistic paradigms and AI progress into cognitive science to develop generalizable models of natural behavior while retaining experimental control and theoretical insight.

Convolutional neural network based decoders for surface codes
quant-ph 2023-12 unverdicted novelty 4.0

Convolutional neural network decoders achieve good performance on surface code error correction and adapt across noise models, with explainable AI used to inspect their decisions.

Predicting Visual Memory Schemas with Variational Autoencoders
cs.CV 2019-07 unverdicted novelty 4.0

Variational autoencoders generate higher-resolution dual-channel visual memory schema maps that separately predict true and false memorability, extending prior CNN approaches.

Naturalistic Computational Cognitive Science: Towards generalizable models and theories that capture the full range of natural behavior
q-bio.NC 2025-02 unverdicted novelty 3.0

Position paper advocating integration of naturalistic paradigms and AI models to create generalizable theories of natural human behavior and cognition.