The iNaturalist Species Classification and Detection Dataset

Alex Shepard; Chen Sun; Grant Van Horn; Hartwig Adam; Oisin Mac Aodha; Pietro Perona; Serge Belongie; Yang Song; Yin Cui

arxiv: 1707.06642 · v2 · pith:CUYKDLTFnew · submitted 2017-07-20 · 💻 cs.CV

Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie

classification 💻 cs.CV

keywords: classification, species, dataset, detection, images, world, computer, different

Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories. In contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. To encourage further progress in challenging real-world conditions, we present the iNaturalist species classification and detection dataset, consisting of 859,000 images from over 5,000 different species of plants and animals. It features visually similar species, captured in a wide variety of situations, from all over the world. Images were collected with different camera types, have varying image quality, feature a large class imbalance, and have been verified by multiple citizen scientists. We discuss the collection of the dataset and present extensive baseline experiments using state-of-the-art computer vision classification and detection models. Results show that current non-ensemble-based methods achieve only 67% top-1 classification accuracy, illustrating the difficulty of the dataset. Specifically, we observe poor results for classes with small numbers of training examples, suggesting more attention is needed in low-shot learning.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SAM 3: Segment Anything with Concepts
cs.CV 2025-11 unverdicted novelty 7.0

SAM 3 introduces promptable concept segmentation that doubles accuracy of prior systems on images and videos while improving standard SAM segmentation performance.

Modular Multimodal Classification Without Fine-Tuning: A Simple Compositional Approach
cs.LG 2026-05 unverdicted novelty 5.0

CoMET achieves strong multimodal classification performance by composing frozen modality encoders, PCA compression, and tabular foundation models without any training, reaching state-of-the-art on diverse benchmarks i...