Recognition: unknown
DinoRADE: Full Spectral Radar-Camera Fusion with Vision Foundation Model Features for Multi-class Object Detection in Adverse Weather
Pith reviewed 2026-05-10 17:22 UTC · model grok-4.3
The pith
DinoRADE fuses dense radar tensors with DINOv3 vision features via deformable cross-attention to improve multi-class object detection in adverse weather by 12.1 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DinoRADE processes dense Radar tensors and aggregates vision features around transformed reference points in the camera perspective via deformable cross-attention, with vision features supplied by a DINOv3 Vision Foundation Model, yielding improved multi-class detection performance on the K-Radar dataset in all weather conditions and outperforming recent Radar-camera approaches by 12.1 percent.
What carries the argument
Deformable cross-attention that aggregates DINOv3 vision features around radar-transformed reference points projected into the camera view
If this is right
- The pipeline enables separate performance reporting for five object classes including vulnerable road users on an adverse-weather dataset.
- Radar-only limitations in fine spatial detail are mitigated by pulling in high-resolution vision features at radar reference locations.
- The 12.1 percent gain over prior radar-camera methods holds across all weather conditions in the K-Radar evaluation.
- Vision foundation model features can be incorporated into radar-centered detection without requiring full image processing at every step.
Where Pith is reading between the lines
- Similar deformable-attention fusion could be tested with other vision foundation models to check whether gains are specific to DINOv3.
- The method's reliance on accurate radar-to-camera projection suggests potential sensitivity to calibration drift in deployed vehicles.
- If the performance lift generalizes, it could reduce the required radar resolution or sensor count in production autonomous driving stacks.
- Extending the same reference-point mechanism to lidar-camera pairs might address low-visibility scenarios beyond radar.
Load-bearing premise
The K-Radar dataset distribution and the chosen reference-point transformation accurately represent real-world radar-camera calibration and adverse-weather statistics, and DINOv3 features transfer without significant domain gap to radar-projected image regions.
What would settle it
Evaluating DinoRADE on an independent radar-camera dataset collected in adverse weather and observing no improvement or a drop in mean average precision for the five object classes would falsify the claimed performance advantage.
Figures
read the original abstract
Reliable and weather-robust perception systems are essential for safe autonomous driving and typically employ multi-modal sensor configurations to achieve comprehensive environmental awareness. While recent automotive FMCW Radar-based approaches achieved remarkable performance on detection tasks in adverse weather conditions, they exhibited limitations in resolving fine-grained spatial details particularly critical for detecting smaller and vulnerable road users (VRUs). Furthermore, existing research has not adequately addressed VRU detection in adverse weather datasets such as K-Radar. We present DinoRADE, a Radar-centered detection pipeline that processes dense Radar tensors and aggregates vision features around transformed reference points in the camera perspective via deformable cross-attention. Vision features are provided by a DINOv3 Vision Foundation Model. We present a comprehensive performance evaluation on the K-Radar dataset in all weather conditions and are among the first to report detection performance individually for five object classes. Additionally, we compare our method with existing single-class detection approaches and outperform recent Radar-camera approaches by 12.1%. The code is available under https://github.com/chr-is-tof/RADE-Net.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces DinoRADE, a radar-centered detection pipeline that processes dense FMCW radar tensors and fuses them with features from a DINOv3 vision foundation model. Features are aggregated around radar-to-camera transformed reference points using deformable cross-attention. The method is evaluated on the K-Radar dataset across weather conditions, reporting multi-class results for five object categories (including VRUs) and claiming a 12.1% improvement over recent radar-camera fusion baselines.
Significance. If the reported gains are reproducible and attributable to the proposed fusion rather than dataset-specific factors, the work would usefully demonstrate how pre-trained vision foundation models can be integrated into radar-centric pipelines to improve spatial resolution for small objects in adverse weather. The emphasis on per-class metrics for five categories on K-Radar and the release of code are constructive contributions to the empirical literature on multi-modal adverse-weather perception.
major comments (3)
- [Abstract and §4] Abstract and §4 (Experiments): The central claim of a 12.1% outperformance is stated without the underlying metric (mAP, AP@0.5, etc.), the numerical scores of the compared radar-camera baselines, or any ablation isolating the DINOv3 deformable-attention component from the radar-only backbone. This absence prevents verification that the gain stems from the claimed full-spectral fusion rather than implementation details or dataset tuning.
- [§3] §3 (Method): The architecture description contains no domain-adaptation layer, weather-conditioned normalization, or explicit handling of the domain gap between DINOv3’s clear-weather pre-training distribution and the fog/rain/snow subsets of K-Radar. The deformable cross-attention simply consumes whatever features DINOv3 produces on the projected regions; if those features degrade substantially, the reported multi-modal benefit may be overstated.
- [§4] §4 (Experiments): No per-weather-condition breakdowns, error analysis, or statistical significance tests are referenced for the five-class results. Without these, it is impossible to determine whether the method’s advantage holds uniformly across adverse conditions or is driven by easier subsets.
minor comments (2)
- [Abstract] The abstract and introduction would benefit from a concise table or sentence listing the exact prior radar-camera methods being compared and their reported scores on the same K-Radar split.
- [§3] Notation for the radar tensor representation and the reference-point transformation could be made more explicit (e.g., coordinate frames and calibration parameters) to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where we will revise the manuscript to improve clarity and completeness.
read point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (Experiments): The central claim of a 12.1% outperformance is stated without the underlying metric (mAP, AP@0.5, etc.), the numerical scores of the compared radar-camera baselines, or any ablation isolating the DINOv3 deformable-attention component from the radar-only backbone. This absence prevents verification that the gain stems from the claimed full-spectral fusion rather than implementation details or dataset tuning.
Authors: We agree that the metric, baseline scores, and an isolating ablation are necessary for full verification. The 12.1% figure refers to the improvement in mAP at IoU=0.5 over the strongest radar-camera baseline on the full K-Radar test set. We will revise the abstract and add a results table in §4 that lists exact mAP scores for all compared methods. We will also insert an ablation subsection in §4 that removes the DINOv3 deformable-attention branch and reports the resulting drop relative to the full model. revision: yes
-
Referee: [§3] §3 (Method): The architecture description contains no domain-adaptation layer, weather-conditioned normalization, or explicit handling of the domain gap between DINOv3’s clear-weather pre-training distribution and the fog/rain/snow subsets of K-Radar. The deformable cross-attention simply consumes whatever features DINOv3 produces on the projected regions; if those features degrade substantially, the reported multi-modal benefit may be overstated.
Authors: We acknowledge the domain-shift issue. Our current design freezes DINOv3 and applies no explicit adaptation or weather-conditioned normalization, relying on the foundation model’s reported robustness. We will expand §3 with a dedicated paragraph discussing the pre-training versus K-Radar distribution gap and its potential impact on feature quality. We will also add a short qualitative study of DINOv3 feature activation maps on adverse-weather images to the supplementary material. revision: partial
-
Referee: [§4] §4 (Experiments): No per-weather-condition breakdowns, error analysis, or statistical significance tests are referenced for the five-class results. Without these, it is impossible to determine whether the method’s advantage holds uniformly across adverse conditions or is driven by easier subsets.
Authors: We agree that condition-specific breakdowns strengthen the claims. K-Radar provides weather labels, so we will add a new table in §4 reporting mAP per weather subset (clear, fog, rain, snow) for the five classes. We will also include a brief error analysis highlighting common failure modes for VRUs and small objects, and report standard deviations across three random seeds to indicate variability. revision: yes
Circularity Check
Empirical pipeline with no derivations or predictions by construction
full rationale
The paper presents DinoRADE as an architecture (dense radar tensor processing + deformable cross-attention to aggregate DINOv3 features) and reports empirical mAP gains on the K-Radar dataset. No equations, first-principles derivations, fitted parameters renamed as predictions, or uniqueness theorems appear in the provided text. Performance claims rest on direct experimental comparison rather than any reduction to self-defined inputs. Self-citations, if present, are not load-bearing for any claimed derivation.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gi- ancarlo Baldan, and Oscar Beijbom
Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gi- ancarlo Baldan, and Oscar Beijbom. nuScenes: A Multi- modal Dataset for Autonomous Driving . In2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11618–11628, Los Alamitos, CA, USA,
- [3]
-
[4]
V oxel r-cnn: Towards high performance voxel-based 3d object detection.Proceed- ings of the AAAI Conference on Artificial Intelligence, 35: 1201–1209, 2021
Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, and Houqiang Li. V oxel r-cnn: Towards high performance voxel-based 3d object detection.Proceed- ings of the AAAI Conference on Artificial Intelligence, 35: 1201–1209, 2021
2021
-
[5]
A review of research on vehicle detection in adverse weather environments.Journal of Traffic and Trans- portation Engineering (English Edition), 12(5):1452–1483, 2025
Sheng Feng, Xueying Cai, Limin Li, Weixing Wang, and Senang Ying. A review of research on vehicle detection in adverse weather environments.Journal of Traffic and Trans- portation Engineering (English Edition), 12(5):1452–1483, 2025. 8 Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026
2025
-
[6]
F. Fent, A. Palffy, and H. Caesar. Dpft: Dual perspec- tive fusion transformer for camera-radar-based object de- tection.IEEE Transactions on Intelligent Vehicles, 10(11): 4929–4941, 2025
2025
-
[7]
Raw ADC data of 77GHz MMWave radar for au- tomotive object detection, 2022
Xiangyu Gao, Youchen Luo, Guanbin Xing, Sumit Roy, and Hui Liu. Raw ADC data of 77GHz MMWave radar for au- tomotive object detection, 2022. Distributed by IEEE Data- port
2022
-
[8]
Are we ready for autonomous driving? the kitti vision benchmark suite
Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving? the kitti vision benchmark suite. In2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 3354–3361, 2012
2012
-
[9]
T- fftradnet: Object detection with swin vision transformers from raw adc radar signals
James Giroux, Martin Bouchard, and Robert Lagani `ere. T- fftradnet: Object detection with swin vision transformers from raw adc radar signals. In2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 4032–4041, 2023
2023
-
[10]
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 580–587, 2014
2014
-
[11]
Wavelet-based multi-view fu- sion of 4d radar tensor and camera for robust 3d object de- tection, 2026
Runwei Guan, Jianan Liu, Shaofeng Liang, Fangqiang Ding, Shanliang Yao, Xiaokai Bai, Daizong Liu, Tao Huang, Guo- qiang Mao, and Hui Xiong. Wavelet-based multi-view fu- sion of 4d radar tensor and camera for robust 3d object de- tection, 2026
2026
-
[12]
Deep residual learning for image recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceed- ings of the IEEE Conference on Computer Vision and Pat- tern Recognition (CVPR), 2016
2016
-
[13]
Junjie Huang, Guan Huang, Zheng Zhu, Yun Ye, and Dalong Du. BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View, 2022. arXiv:2112.11790 [cs]
-
[14]
L4dr: Lidar-4dradar fusion for weather-robust 3d object detection.Proceedings of the AAAI Conference on Artificial Intelligence, 39(4):3806–3814, 2025
Xun Huang, Ziyu Xu, Hai Wu, Jinlong Wang, Qiming Xia, Yan Xia, Jonathan Li, Kyle Gao, Chenglu Wen, and Cheng Wang. L4dr: Lidar-4dradar fusion for weather-robust 3d object detection.Proceedings of the AAAI Conference on Artificial Intelligence, 39(4):3806–3814, 2025
2025
-
[15]
3d object detection for autonomous driving: A survey
JunXin Jin, Wei Liu, Zuotao Ning, Qixi Zhao, Shuai Cheng, and Jun Hu. 3d object detection for autonomous driving: A survey. In2024 36th Chinese Control and Decision Confer- ence (CCDC), pages 3825–3832, 2024
2024
-
[16]
RTNH+: Enhanced 4D Radar Object Detection Network Using Two-Level Preprocessing and Vertical Encoding
Seung-Hyun Kong, Dong-Hee Paek, and Sangyeong Lee. RTNH+: Enhanced 4D Radar Object Detection Network Using Two-Level Preprocessing and Vertical Encoding. IEEE Transactions on Intelligent Vehicles, 10(2):1427– 1440, 2025
2025
- [17]
-
[18]
RADE-Net: Robust Attention Net- work for Radar-only Object Detection in Adverse Weather, 2026
Christof Leitgeb, Thomas Puchleitner, Max Peter Ronecker, and Daniel Watzenig. RADE-Net: Robust Attention Net- work for Radar-only Object Detection in Adverse Weather, 2026
2026
-
[19]
Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chong- hao Sima, Tong Lu, Qiao Yu, and Jifeng Dai. Bevformer: Learning bird’s-eye-view representation from lidar-camera via spatiotemporal transformers.IEEE Transactions on Pat- tern Analysis and Machine Intelligence, 47(3):2020–2036, 2025
2020
-
[20]
Radical: A synchronized fmcw radar, depth, imu and rgb camera data dataset with low-level fmcw radar signals.IEEE Journal of Selected Topics in Signal Processing, PP:1–1, 2021
Teck Yian Lim, Spencer Markowitz, and Minh Do. Radical: A synchronized fmcw radar, depth, imu and rgb camera data dataset with low-level fmcw radar signals.IEEE Journal of Selected Topics in Signal Processing, PP:1–1, 2021
2021
-
[21]
RCBEVDet: Radar-Camera Fusion in Bird’s Eye View for 3D Object Detection
Zhiwei Lin, Zhe Liu, Zhongyu Xia, Xinhao Wang, Yongtao Wang, Shengxiang Qi, Yang Dong, Nan Dong, Le Zhang, and Ce Zhu. RCBEVDet: Radar-Camera Fusion in Bird’s Eye View for 3D Object Detection . In2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14928–14937, Los Alamitos, CA, USA,
-
[22]
IEEE Computer Society
-
[23]
Echoes beyond points: Unleashing the power of raw radar data in multi-modality fusion
Yang Liu, Feng Wang, Naiyan Wang, and ZHAO-XIANG ZHANG. Echoes beyond points: Unleashing the power of raw radar data in multi-modality fusion. InAdvances in Neu- ral Information Processing Systems, pages 53964–53982. Curran Associates, Inc., 2023
2023
-
[24]
Zimmer-Dauphinee, Jor- dan M
Siqi Lu, Junlin Guo, James R. Zimmer-Dauphinee, Jor- dan M. Nieusma, Xiao Wang, Parker VanValkenburgh, Steven A. Wernke, and Yuankai Huo. Vision foundation models in remote sensing: A survey.IEEE Geoscience and Remote Sensing Magazine, 13(3):190–215, 2025
2025
-
[25]
3d object detection from images for autonomous driv- ing: A survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(5):3537–3556, 2024
Xinzhu Ma, Wanli Ouyang, Andrea Simonelli, and Elisa Ricci. 3d object detection from images for autonomous driv- ing: A survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(5):3537–3556, 2024
2024
-
[26]
Automotive radar dataset for deep learning based 3d object detection
Michael Meyer and Georg Kuschk. Automotive radar dataset for deep learning based 3d object detection. In2019 16th European Radar Conference (EuRAD), pages 129–132, 2019
2019
-
[27]
Radarpillars: Efficient object detec- tion from 4d radar point clouds
Alexander Musiat, Laurenz Reichardt, Michael Schulze, and Oliver Wasenm¨uller. Radarpillars: Efficient object detec- tion from 4d radar point clouds. In2024 IEEE 27th Inter- national Conference on Intelligent Transportation Systems (ITSC), pages 1656–1663, 2024
2024
-
[28]
Maxime Oquab, Timoth ´ee Darcet, Th ´eo Moutakanni, Huy V . V o, Marc Szafraniec, Vasil Khalidov, Pierre Fer- nandez, Daniel Haziza, Francisco Massa, Alaaeldin El- Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russ Howes, Po-Yao (Bernie) Huang, Shang-Wen Li, Is- han Misra, Michael G. Rabbat, Vasu Sharma, Gabriel Syn- naeve, Hu Xu, Herv ´e J ´ego...
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[29]
Carrada dataset: Camera and au- tomotive radar with range- angle- doppler annotations
Arthur Ouaknine, Alasdair Newson, Julien Rebut, Florence Tupin, and Patrick P´erez. Carrada dataset: Camera and au- tomotive radar with range- angle- doppler annotations. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 5068–5075, 2021
2020
-
[30]
K-radar: 4d radar object detection for autonomous driving in various weather conditions
Dong-Hee Paek, Seung-Hyun Kong, and Kevin Tirta Wi- jaya. K-radar: 4d radar object detection for autonomous driving in various weather conditions. InThirty-sixth Con- ference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022. 9 Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026
2022
-
[31]
Andras Palffy, Ewoud Pool, Srimannarayana Baratam, Ju- lian F. P. Kooij, and Dariu M. Gavrila. Multi-class road user detection with 3+1d radar in the view-of-delft dataset.IEEE Robotics and Automation Letters, 7(2):4961–4968, 2022
2022
-
[32]
Raw High-Definition Radar for Multi-Task Learn- ing
Julien Rebut, Arthur Ouaknine, Waqas Malik, and Patrick Perez. Raw High-Definition Radar for Multi-Task Learn- ing . In2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 17000–17009, Los Alamitos, CA, USA, 2022. IEEE Computer Society
2022
-
[33]
Raw high-definition radar for multi-task learning
Julien Rebut, Arthur Ouaknine, Waqas Malik, and Patrick P´erez. Raw high-definition radar for multi-task learning. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), pages 17021–17030, 2022
2022
-
[34]
You only look once: Unified, real-time object de- tection
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object de- tection. In2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 779–788, 2016
2016
-
[35]
Bevcar: Camera-radar fusion for bev map and ob- ject segmentation
Jonas Schramm, Niclas V ¨odisch, K ¨ursat Petek, B Ravi Ki- ran, Senthil Yogamani, Wolfram Burgard, and Abhinav Val- ada. Bevcar: Camera-radar fusion for bev map and ob- ject segmentation. In2024 IEEE/RSJ International Confer- ence on Intelligent Robots and Systems (IROS), pages 1435– 1442, 2024
2024
-
[36]
Classification of human activities based on automotive radar spectral images using machine learning techniques: A case study
Linda Senigagliesi, Gianluca Ciattaglia, Deivis Disha, and Ennio Gambi. Classification of human activities based on automotive radar spectral images using machine learning techniques: A case study. In2022 IEEE Radar Conference (RadarConf22), pages 1–6, 2022
2022
-
[37]
Real-time 3d scene understanding for road safety: Depth es- timation and object detection for autonomous vehicle aware- ness.Vehicles, 8(2), 2026
Marcel Simeonov, Andrei Kurdiumov, and Milan Dado. Real-time 3d scene understanding for road safety: Depth es- timation and object detection for autonomous vehicle aware- ness.Vehicles, 8(2), 2026
2026
-
[38]
Oriane Sim ´eoni, Huy V . V o, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Micha ¨el Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timoth´ee Darcet, Th´eo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie,...
2025
-
[39]
Enhanced 3d object detection via diverse feature representations of 4d radar tensor.IEEE Sensors Journal, pages 1–1, 2026
Seung-Hyun Song, Dong-Hee Paek, Minh-Quan Dao, Ezio Malis, and Seung-Hyun Kong. Enhanced 3d object detection via diverse feature representations of 4d radar tensor.IEEE Sensors Journal, pages 1–1, 2026
2026
-
[40]
Learning 3d fea- tures with 2d cnns via surface projection for ct volume seg- mentation
Youyi Song, Zhen Yu, Teng Zhou, Jeremy Yuen-Chun Teoh, Baiying Lei, Kup-Sze Choi, and Jing Qin. Learning 3d fea- tures with 2d cnns via surface projection for ct volume seg- mentation. InMedical Image Computing and Computer As- sisted Intervention – MICCAI 2020, pages 176–186, Cham,
2020
-
[41]
Springer International Publishing
-
[42]
Efficient 4d radar data auto-labeling method using lidar-based object detection network
Min-Hyeok Sun, Dong-Hee Paek, Seung-Hyun Song, and Seung-Hyun Kong. Efficient 4d radar data auto-labeling method using lidar-based object detection network. In2024 IEEE Intelligent Vehicles Symposium (IV), pages 2616– 2621, 2024
2024
-
[43]
Weinberger
Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hari- haran, Mark Campbell, and Kilian Q. Weinberger. Pseudo- lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving. In2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8437–8445, 2019
2019
-
[44]
Casa: A cascade attention network for 3-d object detection from lidar point clouds.IEEE Transactions on Geoscience and Remote Sensing, 60:1–11, 2022
Hai Wu, Jinhao Deng, Chenglu Wen, Xin Li, Cheng Wang, and Jonathan Li. Casa: A cascade attention network for 3-d object detection from lidar point clouds.IEEE Transactions on Geoscience and Remote Sensing, 60:1–11, 2022
2022
-
[45]
Transformation-equivariant 3d object detec- tion for autonomous driving.Proceedings of the AAAI Con- ference on Artificial Intelligence, 37:2795–2802, 2023
Hai Wu, Chenglu Wen, Wei Li, Xin Li, Ruigang Yang, and Cheng Wang. Transformation-equivariant 3d object detec- tion for autonomous driving.Proceedings of the AAAI Con- ference on Artificial Intelligence, 37:2795–2802, 2023
2023
-
[46]
Radar–camera fusion in perspective view and bird’s eye view for 3d object detection.Sensors, 25(19), 2025
Yuhao Xiao, Xiaoqing Chen, Yingkai Wang, and Zhongliang Fu. Radar–camera fusion in perspective view and bird’s eye view for 3d object detection.Sensors, 25(19), 2025
2025
-
[47]
ADCNet: Learning from Raw Radar Data via Distillation,
Bo Yang, Ishan Khatri, Michael Happold, and Chulong Chen. ADCNet: Learning from Raw Radar Data via Dis- tillation, 2023. arXiv:2303.11420 [eess]
-
[48]
Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xi- aopeng Zhang, and Qi Tian. Rethinking rotated object detection with gaussian wasserstein distance loss.CoRR, abs/2101.11952, 2021
-
[49]
Radar-camera fusion for object detection and semantic segmentation in au- tonomous driving: A comprehensive review.IEEE Transac- tions on Intelligent Vehicles, 9(1):2094–2128, 2024
Shanliang Yao, Runwei Guan, Xiaoyu Huang, Zhuoxiao Li, Xiangyu Sha, Yong Yue, Eng Gee Lim, Hyungjoon Seo, Ka Lok Man, Xiaohui Zhu, and Yutao Yue. Radar-camera fusion for object detection and semantic segmentation in au- tonomous driving: A comprehensive review.IEEE Transac- tions on Intelligent Vehicles, 9(1):2094–2128, 2024
2094
-
[50]
Exploring radar data representations in au- tonomous driving: A comprehensive review.IEEE Trans- actions on Intelligent Transportation Systems, 26(6):7401– 7425, 2025
Shanliang Yao, Runwei Guan, Zitian Peng, Chenhang Xu, Yilu Shi, Weiping Ding, Eng Gee Lim, Yong Yue, Hyungjoon Seo, Ka Lok Man, Jieming Ma, Xiaohui Zhu, and Yutao Yue. Exploring radar data representations in au- tonomous driving: A comprehensive review.IEEE Trans- actions on Intelligent Transportation Systems, 26(6):7401– 7425, 2025
2025
-
[51]
Raddet: Range-azimuth-doppler based radar object detec- tion for dynamic road users
Ao Zhang, Farzan Erlik Nowruzi, and Robert Laganiere. Raddet: Range-azimuth-doppler based radar object detec- tion for dynamic road users. In2021 18th Conference on Robots and Vision (CRV), pages 95–102, 2021
2021
-
[52]
Mixedfusion: An efficient multimodal data fusion framework for 3-d object detection and tracking
Cheng Zhang, Hai Wang, Long Chen, Yicheng Li, and Yingfeng Cai. Mixedfusion: An efficient multimodal data fusion framework for 3-d object detection and tracking. IEEE Transactions on Neural Networks and Learning Sys- tems, 36(1):1842–1856, 2025
2025
-
[53]
Xingyi Zhou, Dequan Wang, and Philipp Kr ¨ahenb¨uhl. Ob- jects as points. InarXiv preprint arXiv:1904.07850, 2019
-
[54]
Vpfnet: Improving 3d object detection with virtual point based lidar and stereo data fusion.IEEE Transactions on Multimedia, 25:5291–5304, 2023
Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Qiuyu Mao, Houqiang Li, and Yanyong Zhang. Vpfnet: Improving 3d object detection with virtual point based lidar and stereo data fusion.IEEE Transactions on Multimedia, 25:5291–5304, 2023
2023
-
[55]
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. Deformable detr: Deformable transform- ers for end-to-end object detection.ArXiv, abs/2010.04159, 2020. 10
work page internal anchor Pith review arXiv 2010
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.