You only look once: Unified, real-time object detection

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi · 2016 · arXiv 1506.02640

10 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A global dataset of continuous urban dashcam driving

cs.CV · 2026-04-01 · accept · novelty 7.0

CROWD is a new global dataset of 51,753 continuous urban dashcam segments spanning over 20,000 hours from 238 countries, with manual labels and automated object detections for routine driving analysis.

DroneScan-YOLO: Redundancy-Aware Lightweight Detection for Tiny Objects in UAV Imagery

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

DroneScan-YOLO reaches 55.3% mAP@50 and 35.6% mAP@50-95 on VisDrone2019-DET by combining 1280x1280 input, RPA-Block pruning, MSFD stride-4 branch, and SAL-NWD loss, beating YOLOv8s by 16.6 and 12.3 points with only 4.1% more parameters.

CODO: An Automated Compiler for Comprehensive Dataflow Optimization

cs.AR · 2026-04-14 · unverdicted · novelty 6.0

CODO automates comprehensive dataflow optimization on FPGAs, achieving 1.45x-4.52x speedups on kernels and up to 33.8x on DNN models over state-of-the-art frameworks.

Improving Layout Representation Learning Across Inconsistently Annotated Datasets via Agentic Harmonization

cs.CV · 2026-04-13 · unverdicted · novelty 6.0

VLM-based harmonization of inconsistent annotations across two document layout corpora raises detection F-score from 0.860 to 0.883 and table TEDS from 0.750 to 0.814 while tightening embedding clusters.

Entropy-Gradient Grounding: Training-Free Evidence Retrieval in Vision-Language Models

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

Entropy-gradient grounding uses model uncertainty to retrieve evidence regions in VLMs, improving performance on detail-critical and compositional tasks across multiple architectures.

Unveiling Hidden Lyman Alpha Emitters in the DESI DR1 Data

astro-ph.GA · 2026-05-12 · unverdicted · novelty 5.0

A CNN detects 19,685 LAEs at z=2-3.5 in DESI DR1 spectra with 95% purity and completeness.

Can LLMs Reason About Attention? Towards Zero-Shot Analysis of Multimodal Classroom Behavior

cs.HC · 2026-04-03 · unverdicted · novelty 4.0

A pipeline uses OpenPose and Gaze-LLE to extract pose and gaze data from classroom videos, deletes the raw footage, and applies an LLM for zero-shot behavioral analysis of student attention.

Label-efficient underwater species classification with logistic regression on frozen foundation model embeddings

cs.CV · 2026-03-31 · accept · novelty 4.0

Logistic regression on frozen DINOv3 features achieves 88.5% macro F1 on the AQUA20 marine species benchmark, matching end-to-end supervised models with only 6% of the labels.

Real-Time Cellist Postural Evaluation With On-Device Computer Vision

cs.HC · 2026-04-19 · unverdicted · novelty 3.0

Cello Evaluator is a real-time postural feedback system for cellists running on current Android phones via on-device computer vision, validated as user-friendly by experts.

AI Driven Soccer Analysis Using Computer Vision

cs.CV · 2026-04-09 · unverdicted · novelty 2.0

A system combining object detection, segmentation, keypoint prediction, and homography transforms soccer video into real-world player positions and tactical statistics.

citing papers explorer

Showing 2 of 2 citing papers after filters.

A global dataset of continuous urban dashcam driving · cs.CV · 2026-04-01 · accept · none · ref 58
Label-efficient underwater species classification with logistic regression on frozen foundation model embeddings · cs.CV · 2026-03-31 · accept · none · ref 21
