arxiv: 1902.04103 · v3 · pith:5JGNZFZKnew · submitted 2019-02-11 · 💻 cs.CV

Bag of Freebies for Training Object Detection Neural Networks

classification 💻 cs.CV
keywords: training, models, detection, freebies, however, improve, model, neural

Training heuristics greatly improve the accuracy of various image classification models~\cite{he2018bag}. Object detection models, however, have more complex neural network structures and optimization targets, and their training strategies and pipelines vary dramatically from one model to another. In this work, we explore training tweaks that apply to various models, including Faster R-CNN and YOLOv3. These tweaks do not change the model architectures, so the inference costs remain the same. Our empirical results demonstrate that these freebies can nonetheless improve absolute precision by up to 5% compared to state-of-the-art baselines.

This paper has not been read by Pith yet.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. YOLOX: Exceeding YOLO Series in 2021

    cs.CV 2021-07 accept novelty 6.0

    YOLOX exceeds prior YOLO models by adopting anchor-free detection, decoupled heads, and SimOTA assignment to reach 50.0% AP on COCO for the large variant.

  2. Centralized Copy-Paste: Enhanced Data Augmentation Strategy for Wildland Fire Semantic Segmentation

    cs.CV 2025-07 unverdicted novelty 5.0

    CCPDA augments training data for wildland fire semantic segmentation by centralizing and pasting fire clusters, outperforming standard augmentations on fire-class metrics via multi-objective optimization.

  3. A unified neural network for object detection, multiple object tracking and vehicle re-identification

    cs.CV 2019-07 unverdicted novelty 3.0

    Faster RCNN is extended with a track branch and trained end-to-end on concatenated video frames to unify detection and re-identification, reaching 57.79% mAP on the AIC19 vehicle dataset.