CROWD is a new global dataset of 51,753 continuous urban dashcam segments spanning over 20,000 hours from 238 countries, with manual labels and automated object detections for routine driving analysis.
You only look once: Unified, real-time object detection
10 Pith papers cite this work.
citing papers by year: 2026 (10)
representative citing papers
DroneScan-YOLO reaches 55.3% mAP@50 and 35.6% mAP@50-95 on VisDrone2019-DET by combining 1280x1280 input, RPA-Block pruning, MSFD stride-4 branch, and SAL-NWD loss, beating YOLOv8s by 16.6 and 12.3 points with only 4.1% more parameters.
CODO automates comprehensive dataflow optimization on FPGAs, achieving 1.45x-4.52x speedups on kernels and up to 33.8x on DNN models over state-of-the-art frameworks.
VLM-based harmonization of inconsistent annotations across two document layout corpora raises detection F-score from 0.860 to 0.883 and table TEDS from 0.750 to 0.814 while tightening embedding clusters.
Entropy-gradient grounding uses model uncertainty to retrieve evidence regions in VLMs, improving performance on detail-critical and compositional tasks across multiple architectures.
A CNN detects 19,685 Lyman-alpha emitters (LAEs) at z=2-3.5 in DESI DR1 spectra with 95% purity and completeness.
A pipeline uses OpenPose and Gaze-LLE to extract pose and gaze data from classroom videos, deletes the raw footage, and applies an LLM for zero-shot behavioral analysis of student attention.
Logistic regression on frozen DINOv3 features achieves 88.5% macro F1 on the AQUA20 marine species benchmark, matching end-to-end supervised models with only 6% of the labels.
Cello Evaluator is a real-time postural feedback system for cellists running on current Android phones via on-device computer vision, validated as user-friendly by experts.
A system combining object detection, segmentation, keypoint prediction, and homography transforms soccer video into real-world player positions and tactical statistics.
citing papers explorer
- A global dataset of continuous urban dashcam driving
CROWD is a new global dataset of 51,753 continuous urban dashcam segments spanning over 20,000 hours from 238 countries, with manual labels and automated object detections for routine driving analysis.
- Label-efficient underwater species classification with logistic regression on frozen foundation model embeddings
Logistic regression on frozen DINOv3 features achieves 88.5% macro F1 on the AQUA20 marine species benchmark, matching end-to-end supervised models with only 6% of the labels.