pith. machine review for the scientific record. sign in

hub

YOLOX: Exceeding YOLO Series in 2021

24 Pith papers cite this work. Polarity classification is still indexing.

24 Pith papers citing it
abstract

In this report, we present some experienced improvements to YOLO series, forming a new high-performance detector -- YOLOX. We switch the YOLO detector to an anchor-free manner and conduct other advanced detection techniques, i.e., a decoupled head and the leading label assignment strategy SimOTA to achieve state-of-the-art results across a large scale range of models: For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, we get 25.3% AP on COCO, surpassing NanoDet by 1.8% AP; for YOLOv3, one of the most widely used detectors in industry, we boost it to 47.3% AP on COCO, outperforming the current best practice by 3.0% AP; for YOLOX-L with roughly the same amount of parameters as YOLOv4-CSP, YOLOv5-L, we achieve 50.0% AP on COCO at a speed of 68.9 FPS on Tesla V100, exceeding YOLOv5-L by 1.8% AP. Further, we won the 1st Place on Streaming Perception Challenge (Workshop on Autonomous Driving at CVPR 2021) using a single YOLOX-L model. We hope this report can provide useful experience for developers and researchers in practical scenes, and we also provide deploy versions with ONNX, TensorRT, NCNN, and Openvino supported. Source code is at https://github.com/Megvii-BaseDetection/YOLOX.

hub tools

representative citing papers

AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics

cs.CV · 2026-05-05 · unverdicted · novelty 7.0 · 3 refs

AniMatrix generates anime videos by structuring artistic production rules into a controllable taxonomy and training the model to prioritize those rules over physical realism, achieving top scores from professional animators on prompt understanding and artistic motion.

GateMOT: Q-Gated Attention for Dense Object Tracking

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

GateMOT proposes Q-Gated Attention to enable linear-complexity, spatially aware attention for state-of-the-art dense object tracking on benchmarks like BEE24.

Portable Active Learning for Object Detection

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

PAL is a portable active learning method for object detection that uses class-specific logistic classifiers for uncertainty and image-level diversity to select annotation batches, showing better label efficiency than baselines on COCO, VOC, and BDD100K.

SAMOFT: Robust Multi-Object Tracking via Region and Flow

cs.CV · 2026-05-10 · unverdicted · novelty 5.0

SAMOFT improves multi-object tracking by using SAM segmentation and optical flow for pixel-level motion matching, flexible centroid correction, and training-free motion pattern fixes on top of standard Kalman and ReID baselines.

World Simulation with Video Foundation Models for Physical AI

cs.CV · 2025-10-28 · unverdicted · novelty 4.0

Cosmos-Predict2.5 unifies text-to-world, image-to-world, and video-to-world generation in one model trained on 200M clips with RL post-training, delivering improved quality and control for physical AI.

citing papers explorer

Showing 24 of 24 citing papers.