Injecting pre-computed layout priors from RT-DETR into VLM prompts raises markdown F1 from 0.37 to 0.92 on a 10k-page OOD benchmark and cuts infinite-loop failures across domains.
DETRs beat YOLOs on real-time object detection
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (7)
roles: baseline (1)
polarities: baseline (1)
representative citing papers
SARES-DEIM achieves 76.4% mAP50:95 and 93.8% mAP50 on HRSID by routing SAR features through sparse frequency and wavelet experts plus a high-resolution preservation neck, outperforming prior YOLO and SAR detectors.
Lacuna is an LLM-powered research map for ML that outperforms OpenScholar on retrieval benchmarks and GPT-Researcher on multi-stage report generation tasks.
Proposes DERNet, built on a Decompose-Enhance-Reconstruct operator and three plug-and-play modules, to shift small object detection from spatial to spectral feature processing, claiming better performance than YOLOv11 with 1/6 the parameters.
Hippocampus-DETR integrates a hippocampal memory network (HipNet) into DETR, simulating brain subregions for pattern separation and pattern completion to improve detection accuracy and generalization.
Presents RT-DocLayout, a 33M-parameter end-to-end model extending RT-DETR that unifies layout classification, detection, segmentation, and reading-order prediction at 132.1 FPS with claimed SOTA results on public benchmarks.
Cello Evaluator is a real-time postural feedback system for cellists running on current Android phones via on-device computer vision, validated as user-friendly by experts.
citing papers explorer
- SARES-DEIM: Sparse Mixture-of-Experts Meets DETR for Robust SAR Ship Detection