pith. machine review for the scientific record.

Deformable DETR: Deformable Transformers for End-to-End Object Detection

44 Pith papers cite this work. Polarity classification is still indexing.

abstract

DETR was recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance. However, it suffers from slow convergence and limited feature spatial resolution, due to the limitations of Transformer attention modules in processing image feature maps. To mitigate these issues, we propose Deformable DETR, whose attention modules attend only to a small set of key sampling points around a reference. Deformable DETR can achieve better performance than DETR (especially on small objects) with 10 times fewer training epochs. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. Code is released at https://github.com/fundamentalvision/Deformable-DETR.
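
The idea of attending only to a small set of sampling points around a reference can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the released implementation: it covers a single head and a single feature scale, omits the value and output projections, and the names `bilinear_sample`, `deformable_attention`, `W_off`, and `W_attn` are hypothetical. Each query predicts K offsets and K attention weights from its own feature vector, then averages K bilinearly interpolated samples instead of attending to all H × W positions.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Interpolate feat (H, W, C) at continuous pixel coords (x, y).

    Positions outside the map contribute zero.
    """
    H, W, C = feat.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    out = np.zeros(C)
    for yi, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xi, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= xi < W and 0 <= yi < H:
                out += wx * wy * feat[yi, xi]
    return out

def deformable_attention(query, ref_xy, value, W_off, W_attn):
    """Single-head, single-scale sketch (assumed shapes below).

    query:  (C,)       query feature
    ref_xy: (2,)       reference point in pixel coords (x, y)
    value:  (H, W, C)  feature map to sample from
    W_off:  (C, 2K)    hypothetical linear map to K (dx, dy) offsets
    W_attn: (C, K)     hypothetical linear map to K attention logits
    """
    K = W_attn.shape[1]
    offsets = (query @ W_off).reshape(K, 2)
    logits = query @ W_attn
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()  # softmax over the K sampling points
    out = np.zeros(value.shape[2])
    for (dx, dy), a in zip(offsets, weights):
        out += a * bilinear_sample(value, ref_xy[0] + dx, ref_xy[1] + dy)
    return out
```

In this sketch the per-query cost grows with K rather than with the number of pixels in the feature map, which is the property the abstract credits for allowing higher-resolution feature maps.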

years

2026: 43 · 2025: 1

representative citing papers

Towards Open World Sound Event Detection

cs.SD · 2026-05-05 · unverdicted · novelty 7.0

Introduces OW-SED paradigm and WOOT transformer framework to detect known sounds, identify unseen events, and incrementally learn in open audio environments.

URoPE: Universal Relative Position Embedding across Geometric Spaces

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.

SynthPID: P&ID digitization from Topology-Preserving Synthetic Data

cs.CV · 2026-04-15 · conditional · novelty 7.0

Topology-preserving synthetic P&IDs generated by seeding from real drawings enable models trained solely on synthetics to achieve 63.8% edge mAP on real P&ID benchmarks, closing most of the gap to real-data training.

Online Reasoning Video Object Segmentation

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

The work introduces the ORVOS task, the ORVOSB benchmark with causal annotations across 210 videos, and a baseline using updated prompts plus a temporal token reservoir.

GateMOT: Q-Gated Attention for Dense Object Tracking

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

GateMOT proposes Q-Gated Attention to enable linear-complexity, spatially aware attention for state-of-the-art dense object tracking on benchmarks like BEE24.

citing papers explorer

  • Towards Open World Sound Event Detection cs.SD · 2026-05-05 · unverdicted · none · ref 14 · internal anchor