Recognition: no theorem link
Telescope: Learnable Hyperbolic Foveation for Ultra-Long-Range Object Detection
Pith reviewed 2026-05-10 18:42 UTC · model grok-4.3
The pith
Telescope uses a learnable hyperbolic foveation layer to raise mAP for objects beyond 250 meters from 0.185 to 0.326.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Telescope combines a standard detection backbone with a re-sampling layer that applies a trainable hyperbolic foveation transformation to the input image. The transformation enlarges the effective resolution of regions containing small, distant objects. On driving scenes this produces a 76 percent relative improvement in mean average precision for detection beyond 250 meters, moving absolute mAP from 0.185 to 0.326, with minimal added cost and no degradation at closer ranges.
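The headline relative gain follows directly from the two absolute mAP values; a quick arithmetic check:

```python
# Relative improvement implied by the reported absolute mAP values (>250 m).
baseline_map = 0.185   # prior state of the art
telescope_map = 0.326  # Telescope
relative_gain = (telescope_map - baseline_map) / baseline_map
print(f"{relative_gain:.1%}")  # → 76.2%, consistent with the reported 76%
```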
What carries the argument
The learnable hyperbolic foveation re-sampling layer, a module that uses a trainable hyperbolic mapping to re-sample the image and allocate higher pixel density to distant scene regions.
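The review does not spell out the layer's parameterization, but the core idea can be sketched as a warp along the image's vertical axis: output rows are mapped to source rows through a hyperbolic-sine function whose slope is smallest near a horizon line, so that band of the source image receives the most output pixels. Below is a minimal numpy sketch under that assumption; `alpha` (warp strength) and `y0` (horizon position) are hypothetical stand-ins for the learnable parameters, and the paper's actual layer is 2-D, differentiable, and trained end-to-end.

```python
import numpy as np

def hyperbolic_foveate_rows(img: np.ndarray, alpha: float = 3.0,
                            y0: float = 0.4) -> np.ndarray:
    """Re-sample image rows so pixel density concentrates near the
    horizon row y0 (normalized to [0, 1]). Hypothetical 1-D sketch:
    the inverse warp has its smallest slope at y0, so output rows
    near the horizon sample a narrow (i.e. magnified) source band."""
    h = img.shape[0]
    u = np.linspace(0.0, 1.0, h)            # normalized output rows
    half = max(y0, 1.0 - y0)                # normalizing half-width
    t = np.sinh(alpha * (u - y0)) / np.sinh(alpha * half)
    src = np.clip(y0 + t * half, 0.0, 1.0)  # monotone warp into [0, 1]
    idx = np.round(src * (h - 1)).astype(int)
    return img[idx]                         # nearest-neighbor row gather
```

In a trainable version, `alpha` and `y0` would be network parameters and the gather would use differentiable bilinear sampling (spatial-transformer-style), so gradients from the detection loss can shape the warp.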
If this is right
- Autonomous vehicles can detect critical objects at braking distances required for high-speed highway operation.
- Image-only detection becomes viable for ultra-long ranges without requiring upgraded LiDAR hardware.
- The same model maintains competitive performance at short and medium ranges.
- The approach adds only modest computational overhead to existing detection pipelines.
Where Pith is reading between the lines
- The same re-sampling idea could be tested on other vision tasks where scale varies sharply across an image, such as aerial surveillance.
- End-to-end training of the foveation parameters may allow similar gains in domains where biological foveation has not yet been adapted.
- Combining the layer with temporal fusion across video frames could further extend reliable detection range.
Load-bearing premise
The hyperbolic re-sampling layer must increase effective resolution on distant objects without creating artifacts that harm detection or reduce accuracy on nearby objects.
What would settle it
Run the trained Telescope model on a new set of real highway images containing objects at distances over 250 meters; observing no mAP gain, or visible distortions in the transformed image regions, would refute the claim.
Original abstract
Autonomous highway driving, especially for long-haul heavy trucks, requires detecting objects at long ranges beyond 500 meters to satisfy braking distance requirements at high speeds. At long distances, vehicles and other critical objects occupy only a few pixels in high-resolution images, causing state-of-the-art object detectors to fail. This challenge is compounded by the limited effective range of commercially available LiDAR sensors, which fall short of ultra-long range thresholds because of quadratic loss of resolution with distance, making image-based detection the most practically scalable solution given commercially available sensor constraints. We introduce Telescope, a two-stage detection model designed for ultra-long range autonomous driving. Alongside a powerful detection backbone, this model contains a novel re-sampling layer and image transformation to address the fundamental challenges of detecting small, distant objects. Telescope achieves $76\%$ relative improvement in mAP in ultra-long range detection compared to state-of-the-art methods (improving from an absolute mAP of 0.185 to 0.326 at distances beyond 250 meters), requires minimal computational overhead, and maintains strong performance across all detection ranges.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Telescope, a two-stage object detector for ultra-long-range autonomous driving that incorporates a learnable hyperbolic foveation re-sampling layer and image transformation to increase effective resolution for distant objects occupying few pixels. It reports a 76% relative mAP improvement (from 0.185 to 0.326) for objects beyond 250 m on a highway driving dataset, with claims of no degradation at shorter ranges, minimal computational overhead, and motivation from foveated vision principles implemented in a differentiable manner.
Significance. If the reported gains hold under rigorous verification, the work could meaningfully advance image-based long-range perception for high-speed highway scenarios where LiDAR range is insufficient. The differentiable hyperbolic re-sampling is a concrete technical contribution that aligns with biological foveation and could be adopted in other detectors; the absence of circularity in the empirical gains (as they are not reduced to fitted quantities by construction) strengthens the case for further investigation.
major comments (2)
- [Abstract] Abstract and results: the headline mAP values (0.185 baseline to 0.326) are presented without error bars, standard deviations from multiple runs, or statistical significance tests; this directly affects confidence in the 76% relative improvement claim for the >250 m regime.
- [Results] Evaluation: the distance-thresholded mAP (>250 m) requires explicit details on dataset size, object count in the long-range subset, distance measurement method, and whether the split is fixed or cross-validated; without these, the improvement cannot be fully assessed as robust rather than dataset-specific.
minor comments (2)
- The paper should include an ablation study isolating the contribution of the learnable hyperbolic parameters versus the backbone or other components to confirm the source of the gain.
- [Methods] Clarify the exact parameterization of the hyperbolic foveation layer (e.g., the form of the learnable parameters and their initialization) in the methods section for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive review and the recommendation for minor revision. We appreciate the recognition of the technical contribution and the potential impact on long-range perception. We address each major comment below, indicating where revisions will be incorporated to improve clarity and robustness.
Point-by-point responses
-
Referee: [Abstract] Abstract and results: the headline mAP values (0.185 baseline to 0.326) are presented without error bars, standard deviations from multiple runs, or statistical significance tests; this directly affects confidence in the 76% relative improvement claim for the >250 m regime.
Authors: We acknowledge that reporting variability measures would strengthen the presentation of the headline results. The reported mAP values are from a single training run, as is common in large-scale detection experiments due to computational constraints on our highway dataset. In the revised manuscript we will add an explicit statement in both the abstract and the results section noting the single-run nature of the evaluation and will include additional supporting evidence from ablation studies showing consistent relative gains across multiple distance thresholds and backbone variants. We will also add a brief discussion of why formal significance testing was not performed. revision: partial
-
Referee: [Results] Evaluation: the distance-thresholded mAP (>250 m) requires explicit details on dataset size, object count in the long-range subset, distance measurement method, and whether the split is fixed or cross-validated; without these, the improvement cannot be fully assessed as robust rather than dataset-specific.
Authors: We agree that these details are essential for assessing robustness. The main paper summarized the dataset at a high level while deferring some specifics to the supplementary material. In the revised version we will expand the 'Dataset and Evaluation Protocol' subsection to explicitly state: the total number of images and annotations, the number of objects beyond 250 m in the test set, the distance measurement procedure (camera-LiDAR fusion with GPS ground truth), and confirmation that the train/test split follows the dataset's fixed protocol without cross-validation. These additions will be placed in the main text rather than only in the supplement. revision: yes
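The distance-thresholded protocol under discussion can be made concrete. The sketch below is a simplification with hypothetical field layouts: greedy matching at a single IoU threshold and a single score cut, whereas full COCO-style mAP sweeps scores and IoU thresholds; note also that detections matching excluded near-range ground truth count as false positives here, while real protocols typically ignore them.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def precision_recall_beyond(dets, gts, min_dist_m=250.0, iou_thr=0.5):
    """Evaluate only ground truth beyond min_dist_m.
    dets: list of (score, box); gts: list of (distance_m, box).
    Greedy highest-score-first matching at one IoU threshold."""
    far = [g for g in gts if g[0] > min_dist_m]
    matched = [False] * len(far)
    tp = 0
    for score, box in sorted(dets, key=lambda d: -d[0]):
        best, best_iou = -1, iou_thr
        for i, (_, gbox) in enumerate(far):
            ov = box_iou(box, gbox)
            if not matched[i] and ov >= best_iou:
                best, best_iou = i, ov
        if best >= 0:
            matched[best] = True
            tp += 1
    fp = len(dets) - tp
    return tp / max(tp + fp, 1), tp / max(len(far), 1)
```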
Circularity Check
No significant circularity; empirical results self-contained
full rationale
The manuscript introduces a two-stage detector with a differentiable hyperbolic re-sampling layer motivated by foveated vision, then reports mAP gains (0.185 to 0.326) from training and distance-thresholded evaluation on highway data. No equations, uniqueness theorems, or predictions are shown that reduce the reported improvement to a fitted quantity or self-citation by construction. The central claim rests on experimental tables and architecture details that remain independently verifiable against external datasets and baselines.
Axiom & Free-Parameter Ledger
free parameters (1)
- learnable hyperbolic foveation parameters
Reference graph
Works this paper leans on
- [1] Bai, Y., Zhang, Y., Ding, M., Ghanem, B.: Finding tiny faces in the wild with generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 21–30 (2018)
- [2] Bolya, D., Huang, P.Y., Sun, P., Cho, J.H., Madotto, A., Wei, C., Ma, T., Zhi, J., Rajasegaran, J., Rasheed, H., et al.: Perception Encoder: The best visual embeddings are not at the output of the network. arXiv preprint arXiv:2504.13181 (2025)
- [3] Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., Beijbom, O.: nuScenes: A multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11621–11631 (2020)
- [4] Carion, N., Gustafson, L., Hu, Y.T., Debnath, S., Hu, R., Suris, D., Ryali, C., Alwala, K.V., Khedr, H., Huang, A., et al.: SAM 3: Segment anything with concepts. arXiv preprint arXiv:2511.16719 (2025)
- [5] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: European conference on computer vision, pp. 213–229. Springer (2020)
- [6] Chang, M.F., Lambert, J., Sangkloy, P., Singh, J., Bak, S., Hartnett, A., Wang, D., Carr, P., Lucey, S., Ramanan, D., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 8748–8757 (2019)
- [7] Chen, C., Liu, M.Y., Tuzel, O., Xiao, J.: R-CNN for small object detection. In: Asian conference on computer vision, pp. 214–230. Springer (2016)
- [8] Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Xu, J., Zhang, Z., Cheng, D., Zhu, C., Cheng, T., Zhao, Q., Li, B., Lu, X., Zhu, R., Wu, Y., Dai, J., Wang, J., Shi, J., Ouyang, W., Loy, C.C., Lin, D.: MMDetection: Open MMLab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155 (2019)
- [9] Cheng, G., Yuan, X., Yao, X., Yan, K., Zeng, Q., Xie, X., Han, J.: Towards large-scale small object detection: Survey and benchmarks. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 13467–13488 (2023)
- [10] Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3213–3223 (2016)
- [11] Dai, X., Chen, Y., Yang, J., Zhang, P., Yuan, L., Zhang, L.: Dynamic DETR: End-to-end object detection with dynamic attention. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 2988–2997 (2021)
- [12] Dai, Z., Cai, B., Lin, Y., Chen, J.: UP-DETR: Unsupervised pre-training for object detection with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 1601–1610 (2021)
- [13] De Plaen, H., De Plaen, P.F., Suykens, J.A., Proesmans, M., Tuytelaars, T., Van Gool, L.: Unbalanced optimal transport: A unified framework for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3198–3207 (2023)
- [14] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. IEEE (2009)
- [15] Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE conference on computer vision and pattern recognition, pp. 3354–3361. IEEE (2012)
- [16] Ghilotti, F., Palladin, E., Brucker, S., Sigal, A., Bijelic, M., Heide, F.: TruckDrive: Long-range autonomous highway driving dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2026)
- [17] Gong, Y., Yu, X., Ding, Y., Peng, X., Zhao, J., Han, Z.: Effective fusion factor in FPN for tiny object detection. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 1160–1168 (2021)
- [18] Guo, G., Chen, P., Yu, X., Han, Z., Ye, Q., Gao, S.: Save the tiny, save the all: Hierarchical activation network for tiny object detection. IEEE Transactions on Circuits and Systems for Video Technology 34(1), 221–234 (2023)
- [19] Hidayatullah, P., Syakrani, N., Sholahuddin, M.R., Gelar, T., Tubagus, R.: YOLOv8 to YOLO11: A comprehensive architecture in-depth comparative review. arXiv preprint arXiv:2501.13400 (2025)
- [20] Huang, X., Cheng, X., Geng, Q., Cao, B., Zhou, D., Wang, P., Lin, Y., Yang, R.: The ApolloScape dataset for autonomous driving. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 954–960 (2018)
- [21] Jabbireddy, S., Sun, X., Meng, X., Varshney, A.: Foveated rendering: Motivation, taxonomy, and research directions. arXiv preprint arXiv:2205.04529 (2022)
- [22] Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Advances in Neural Information Processing Systems 28 (2015)
- [23] Lee, G., Hong, S., Cho, D.: Self-supervised feature enhancement networks for small object detection in noisy images. IEEE Signal Processing Letters 28, 1026–1030 (2021)
- [24] Lee, J.M.: Introduction to Riemannian Manifolds, vol. 2. Springer (2018)
- [25] Li, F., Zhang, H., Liu, S., Guo, J., Ni, L.M., Zhang, L.: DN-DETR: Accelerate DETR training by introducing query denoising. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 13619–13627 (2022)
- [26] Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117–2125 (2017)
- [27] Liu, J., Zhang, J., Ni, Y., Chi, W., Qi, Z.: Small-object detection in remote sensing images with super-resolution perception. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 17, 15721–15734 (2024)
- [28] Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., Yang, J., Jiang, Q., Li, C., Yang, J., Su, H., et al.: Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection. In: European conference on computer vision, pp. 38–55. Springer (2024)
- [29] Mao, J., Shi, S., Wang, X., Li, H.: 3D object detection for autonomous driving: A comprehensive survey. International Journal of Computer Vision 131(8), 1909–1963 (2023)
- [30] Mirzaei, B., Nezamabadi-Pour, H., Raoof, A., Derakhshani, R.: Small object detection and tracking: A comprehensive review. Sensors 23(15), 6887 (2023)
- [31] Nguyen, N.D., Do, T., Ngo, T.D., Le, D.D.: An evaluation of deep learning methods for small object detection. Journal of Electrical and Computer Engineering 2020(1), 3189691 (2020)
- [32] Noh, J., Bae, W., Lee, W., Seo, J., Kim, G.: Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 9725–9734 (2019)
- [33] Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., et al.: DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023)
- [34] Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., Rädle, R., Rolland, C., Gustafson, L., et al.: SAM 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024)
- [35] Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788 (2016)
- [36] Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 28 (2015)
- [37] Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: A metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 658–666 (2019)
- [38] Shinya, Y.: USB: Universal-scale object detection benchmark. arXiv preprint arXiv:2103.14027 (2021)
- [39] Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., et al.: DINOv3. arXiv preprint arXiv:2508.10104 (2025)
- [40] Stanoyevitch, A., Stegenga, D.A.: The geometry of Poincaré disks. Complex Variables and Elliptic Equations 24(3-4), 249–265 (1994)
- [41] Sun, P., Kretzschmar, H., Dotiwalla, X., Chouard, A., Patnaik, V., Tsui, P., Guo, J., Zhou, Y., Chai, Y., Caine, B., et al.: Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 2446–2454 (2020)
- [42] Thavamani, C., Li, M., Cebron, N., Ramanan, D.: FOVEA: Foveated image magnification for autonomous navigation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 15539–15548 (2021)
- [43] Tian, Z., Shen, C., Chen, H., He, T.: FCOS: Fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 9627–9636 (2019)
- [44] Wang, D., Zhang, Q., Xu, Y., Zhang, J., Du, B., Tao, D., Zhang, L.: Advancing plain vision transformer toward remote sensing foundation model. IEEE Transactions on Geoscience and Remote Sensing 61, 1–15 (2022)
- [45] Wang, J., Xu, C., Yang, W., Yu, L.: A normalized Gaussian Wasserstein distance for tiny object detection. arXiv preprint arXiv:2110.13389 (2021)
- [46] Wei, W., Cheng, Y., He, J., Zhu, X.: A review of small object detection based on deep learning. Neural Computing and Applications 36(12), 6283–6303 (2024)
- [47] Wilson, B., Qi, W., Agarwal, T., Lambert, J., Singh, J., Khandelwal, S., Pan, B., Kumar, R., Hartnett, A., Pontes, J.K., et al.: Argoverse 2: Next generation datasets for self-driving perception and forecasting. arXiv preprint arXiv:2301.00493 (2023)
- [48] Wong, K., Gu, Y., Kamijo, S.: Mapping for autonomous driving: Opportunities and challenges. IEEE Intelligent Transportation Systems Magazine 13(1), 91–106 (2020)
- [49] Xia, G.S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., Zhang, L.: DOTA: A large-scale dataset for object detection in aerial images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3974–3983 (2018)
- [50] Xia, Z., Pan, X., Song, S., Li, L.E., Huang, G.: Vision transformer with deformable attention. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 4794–4803 (2022)
- [51] Xu, C., Wang, J., Yang, W., Yu, H., Yu, L., Xia, G.S.: Detecting tiny objects in aerial images: A normalized Wasserstein distance and a new benchmark. ISPRS Journal of Photogrammetry and Remote Sensing 190, 79–93 (2022)
- [52] Xu, C., Wang, J., Yang, W., Yu, H., Yu, L., Xia, G.S.: RFLA: Gaussian receptive field based label assignment for tiny object detection. In: European conference on computer vision, pp. 526–543. Springer (2022)
- [53] Yang, C., Huang, Z., Wang, N.: QueryDet: Cascaded sparse query for accelerating high-resolution small object detection. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp. 13668–13677 (2022)
- [54] Yu, X., Gong, Y., Jiang, N., Ye, Q., Han, Z.: Scale match for tiny person detection. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 1257–1265 (2020)
- [55] Yurtsever, E., Lambert, J., Carballo, A., Takeda, K.: A survey of autonomous driving: Common practices and emerging technologies. IEEE Access 8, 58443–58469 (2020)
- [56] Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L.M., Shum, H.Y.: DINO: DETR with improved denoising anchor boxes for end-to-end object detection. arXiv preprint arXiv:2203.03605 (2022)
- [57] Zhao, Y., Zhu, F., Mi, Y., Chen, D., Xiong, G.: Simple-FPN: An image anomaly detection and localization network based on SimpleNet and feature pyramid. In: 2024 IEEE 4th International Conference on Digital Twins and Parallel Intelligence (DTPI), pp. 417–422. IEEE (2024)
- [58] Zhao, Z.Q., Zheng, P., Xu, S.T., Wu, X.: Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems 30(11), 3212–3232 (2019)
- [59] Zhou, X., Wang, D., Krähenbühl, P.: Objects as points. arXiv preprint arXiv:1904.07850 (2019)
- [60] Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DETR: Deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159 (2020)
- [61] Zou, Z., Chen, K., Shi, Z., Guo, Y., Ye, J.: Object detection in 20 years: A survey. Proceedings of the IEEE 111(3), 257–276 (2023)
Appendix
Section A reports details on image-based object distance estimation as well as distance-based statistics and information regarding the Argoverse 2 [47] autonomous driving dataset. Section B provides an additi...
For the TruckDrive dataset, the focal length is f = 3304 pixels. Under the pinhole camera model, the object distance d can be approximated as

d ≈ f · H_c / h_p,  (6)

where H_c denotes the object's physical height and h_p its height in pixels.
[Figure: distribution of object ranges (0–50 m, 50–150 m, 150–250 m, ≥250 m), shown both as percentage of objects and as percentage of pixels in the image.]
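Equation (6) can be checked numerically. Only the focal length f = 3304 comes from the text; the object height and pixel height below are illustrative values:

```python
def pinhole_distance(f_px: float, object_height_m: float,
                     object_height_px: float) -> float:
    """d ≈ f * H_c / h_p under the pinhole camera model."""
    return f_px * object_height_m / object_height_px

# e.g. a ~3 m tall truck cab spanning 20 pixels with f = 3304 px:
d = pinhole_distance(3304, 3.0, 20)
print(round(d, 1))  # → 495.6 (meters), i.e. ultra-long range
```

By the same relation, at 1 km that object would span only about 10 pixels, which illustrates why detectors struggle at ultra-long range without magnification.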
[Figure: qualitative comparison. Baselines are strong general object detectors but perform worse in long and ultra-long range object detection. Ground truth annotations are shown on the left; zoomed-in views corresponding to the red rectangles highlight detections at long and ultra-long range, where some objects reach up to 1 km. All baselines are fine-tuned on the TruckDrive [16] dataset...]