Scene Reconstruction as Mapping Priors for 3D Detection
Pith reviewed 2026-05-25 05:41 UTC · model grok-4.3
The pith
Reconstructed scene maps from aggregated sensors serve as priors that improve 3D object detection by resolving ambiguities in sparse or noisy data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Automatically constructed dense mapping priors, derived from aggregated sensor data, can be integrated with sensor modalities inside the MPA3D framework to produce state-of-the-art 3D detection performance on the Waymo Open Dataset by supplying static environmental structure that resolves sensor ambiguities.
What carries the argument
The Mapping Priors Augmented 3D Detection (MPA3D) framework that fuses reconstructed scene priors with different sensor inputs.
If this is right
- Detection accuracy rises for distant objects where sensor returns are sparse.
- Performance holds up better under adverse weather that degrades raw sensor quality.
- Large-scale deployment no longer requires manual creation or maintenance of HD maps.
- The same reconstructed priors can be reused across multiple perception modules without extra labeling cost.
Where Pith is reading between the lines
- The approach could be tested on other autonomous-driving datasets to check whether the prior benefit generalizes beyond Waymo.
- If priors are updated online from recent drives, detection might adapt to slow environmental changes such as construction.
- Combining the priors with online mapping systems might reduce the need for high-resolution sensors in some regimes.
Load-bearing premise
Dense mapping priors built automatically from sensor data can be added to detection models without introducing new errors or biases that cancel the intended gains.
What would settle it
Run the identical detector on the same Waymo validation scenes once with the mapping priors and once without them; if the mAP or range-specific metrics show no consistent improvement when priors are present, the central claim fails.
Figures
read the original abstract
In autonomous driving, mapping is critical for motion planning but remains an under-utilized resource for perception tasks such as 3D object detection. Maps can provide robust structural priors of the static environment, helping resolve ambiguities and correct for sensor data sparsity or noise, especially for distant objects or under adverse weather conditions. However, conventional High-Definition (HD) maps are resource-intensive to obtain and maintain, which presents a challenge for efficient, large-scale deployment. In this paper, we propose a scalable solution to systematically leverage mapping to improve 3D detection by overcoming two primary challenges. First, we introduce a pipeline to automatically build dense mapping priors from aggregated sensor data, eliminating the need for human labeling. Second, we design a novel Mapping Priors Augmented 3D Detection (MPA3D) framework to effectively integrate mapping priors with different sensor modalities. Extensive experiments on the Waymo Open Dataset demonstrate that our approach achieves new state-of-the-art results, proving the effectiveness of scalable reconstructed scene priors for enhancing 3D detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a pipeline to automatically build dense mapping priors from aggregated sensor data without human labeling, and introduces the Mapping Priors Augmented 3D Detection (MPA3D) framework to integrate these priors with sensor modalities. It claims this overcomes challenges with conventional HD maps and achieves new state-of-the-art results on the Waymo Open Dataset for 3D object detection.
Significance. If the central claims were substantiated with verifiable experiments, the work could have significance for scalable perception in autonomous driving by reducing dependence on labor-intensive HD maps and improving robustness under sensor sparsity or adverse conditions. However, the provided abstract contains no technical details, equations, ablations, or quantitative evidence, so significance cannot be assessed.
major comments (1)
- [Abstract] Abstract: the claim that the approach 'achieves new state-of-the-art results' on Waymo supplies no methods, ablation studies, error bars, dataset details, or quantitative tables, so the data cannot be verified to support the claim as stated.
Simulated Author's Rebuttal
We thank the referee for their comments. Below we address the major comment point by point.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claim that the approach 'achieves new state-of-the-art results' on Waymo supplies no methods, ablation studies, error bars, dataset details, or quantitative tables, so the data cannot be verified to support the claim as stated.
Authors: The abstract is intentionally concise and summarizes the key contribution and outcome. The full manuscript provides the requested details: Section 3 describes the MPA3D framework and mapping-prior construction pipeline with equations; Section 4.1 specifies the Waymo Open Dataset splits and evaluation protocol; Section 4.2 presents quantitative tables comparing against prior methods with mAP and mAPH metrics; Section 4.3 contains ablation studies; and error bars are reported where statistical variation is assessed. These sections directly substantiate the state-of-the-art claim. revision: no
Circularity Check
No significant circularity; no derivation chain present to inspect
full rationale
The supplied abstract and manuscript placeholder contain no equations, parameter fits, self-citations, or claimed derivations of any kind. Without visible technical steps that could reduce to inputs by construction, none of the enumerated circularity patterns can be exhibited. The paper's central claims are empirical SOTA results on Waymo; these are not shown to be forced by any internal definition or self-referential loop. This is the normal honest finding when no load-bearing mathematical content is supplied.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Mad: Memory-augmented detection of 3d objects
Ben Agro, Sergio Casas, Patrick Wang, Thomas Gilles, and Raquel Urtasun. Mad: Memory-augmented detection of 3d objects. In2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1449–1460, 2025. 3, 7
work page 2025
-
[2]
Bevmap: Map-aware bev modeling for 3d perception
Mincheol Chang, Seokha Moon, Reza Mahjourian, and Jinkyu Kim. Bevmap: Map-aware bev modeling for 3d perception. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 7419–7428, 2024. 2
work page 2024
-
[3]
Maps for autonomous driving: Full-process survey and frontiers.arXiv preprint arXiv:2509.12632, 2025
Pengxin Chen, Zhipeng Luo, Xiaoqi Jiang, Zhangcai Yin, and Jonathan Li. Maps for autonomous driving: Full-process survey and frontiers.arXiv preprint arXiv:2509.12632, 2025. 2
-
[4]
Mppnet: Multi-frame feature intertwining with proxy points for 3d temporal object detection
Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, and Hongsheng Li. Mppnet: Multi-frame feature intertwining with proxy points for 3d temporal object detection. InEuropean Conference on Computer Vision, pages 680–697. Springer, 2022. 3, 7
work page 2022
-
[5]
Point transformer.IEEE Access, 2021
Nico Engel, Vasileios Belagiannis, and Klaus Dietmayer. Point transformer.IEEE Access, 2021. 2
work page 2021
-
[6]
Embracing Single Stride 3D Object Detector with Sparse Transformer
Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, and Zhaoxiang Zhang. Embracing Single Stride 3D Object Detector with Sparse Transformer. InCVPR, 2022. 6
work page 2022
-
[7]
Fully Sparse 3D Object Detection
Lue Fan, Feng Wang, Naiyan Wang, and Zhaoxiang Zhang. Fully Sparse 3D Object Detection. InNeurIPS, 2022. 6, 7
work page 2022
-
[8]
Lue Fan, Feng Wang, Naiyan Wang, and Zhaoxiang Zhang. Fsd v2: Improving fully sparse 3d object detection with virtual voxels.arXiv preprint arXiv:2308.03755, 2023. 6
-
[9]
Strobe: Streaming object detection from lidar packets
Davi Frossard, Shun Da Suo, Sergio Casas, James Tu, and Raquel Urtasun. Strobe: Streaming object detection from lidar packets. InConference on Robot Learning, pages 1174–1183. PMLR, 2021. 2
work page 2021
-
[10]
Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, Hongyu Zhou, Weixin Mao, Yuang Peng, and Xiangyu Zhang. Exploring recurrent long-term temporal fusion for multi-view 3d perception.IEEE Robotics and Automation Letters, 9(7):6544–6551, 2024. 2
work page 2024
-
[11]
Msf: Motion-guided sequential fusion for efficient 3d object detection from point cloud sequences
Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, and Lei Zhang. Msf: Motion-guided sequential fusion for efficient 3d object detection from point cloud sequences. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5196–5205, 2023. 3, 7
work page 2023
-
[12]
Lef: Late-to-early temporal fusion for lidar 3d object detection
Tong He, Pei Sun, Zhaoqi Leng, Chenxi Liu, Dragomir Anguelov, and Mingxing Tan. Lef: Late-to-early temporal fusion for lidar 3d object detection. In2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1637–1644. IEEE, 2023. 2, 7
work page 2023
-
[13]
Gaussian Error Linear Units (GELUs)
D Hendrycks. Gaussian error linear units (gelus).arXiv preprint arXiv:1606.08415, 2016. 5
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[14]
Jinghua Hou, Zhe Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai, et al. Query-based temporal fusion with explicit motion for 3d object detection.Advances in Neural Information Processing Systems, 36:75782–75797, 2023. 3
work page 2023
-
[15]
Vadet: Multi-frame lidar 3d object detection using variable aggregation
Chengjie Huang, Vahdat Abdelzad, Sean Sedwards, and Krzysztof Czarnecki. Vadet: Multi-frame lidar 3d object detection using variable aggregation. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 711–720. IEEE, 2025. 7
work page 2025
-
[16]
Ptt: Point-trajectory transformer for efficient temporal 3d object detection
Kuan-Chih Huang, Weijie Lyu, Ming-Hsuan Yang, and Yi-Hsuan Tsai. Ptt: Point-trajectory transformer for efficient temporal 3d object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14938–14947, 2024. 7
work page 2024
-
[17]
3d gaussian splatting for real-time radiance field rendering.ACM Trans
Bernhard Kerbl, Georgios Kopanas, Thomas Leimk¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Trans. Graph., 42(4):139–1, 2023. 2, 4
work page 2023
-
[18]
Bernhard Kerbl, Andreas Meuleman, Georgios Kopanas, Michael Wimmer, Alexandre Lanvin, and George Drettakis. A hierarchical 3d gaussian representation for real-time rendering of very large datasets.ACM Transactions on Graphics (TOG), 43(4):1–15, 2024
work page 2024
-
[19]
Shakiba Kheradmand, Daniel Rebain, Gopal Sharma, Weiwei Sun, Yang-Che Tseng, Hossam Isack, Abhishek Kar, Andrea Tagliasacchi, and Kwang Moo Yi. 3d gaussian splatting as markov chain monte carlo.Advances in Neural Information Processing Systems, 37:80965–80986, 2024. 2
work page 2024
-
[20]
Joint 3d proposal generation and object detection from view aggregation
Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, and Steven L Waslander. Joint 3d proposal generation and object detection from view aggregation. In2018 IEEE/RSJ international conference on intelligent robots and systems (IROS), pages 1–8. IEEE, 2018. 2
work page 2018
-
[21]
Lang, Sourabh V ora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom
Alex H. Lang, Sourabh V ora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. Pointpillars: Fast encoders for object detection from point clouds. InCVPR, 2019. 2, 6
work page 2019
-
[22]
Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Dogus Cubuk, Pei Sun, Tong He, Dragomir Anguelov, and Mingxing Tan. Lidaraugment: Searching for scalable 3d lidar data augmentations.arXiv preprint arXiv:2210.13488, 2022. 6
-
[23]
3d fully convolutional network for vehicle detection in point cloud
Bo Li. 3d fully convolutional network for vehicle detection in point cloud. In2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1513–1518. IEEE, 2017. 2
work page 2017
-
[24]
Logonet: Towards accurate 3d object detection with local-to-global cross-modal fusion
Xin Li, Tao Ma, Yuenan Hou, Botian Shi, Yuchen Yang, Youquan Liu, Xingjiao Wu, Qin Chen, Yikang Li, Yu Qiao, et al. Logonet: Towards accurate 3d object detection with local-to-global cross-modal fusion. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17524–17534, 2023. 7
work page 2023
-
[25]
Hdmapnet: An online hd map construction and evaluation framework
Yicheng Li, Qi Li, Tian Guo, Li Wang, Yu Wang, Qinhong Zhang, Yi Ding, Yingfeng Zhang, and Liangjun Zheng. Hdmapnet: An online hd map construction and evaluation framework. InIEEE International Conference on Robotics and Automation (ICRA), pages 4628–4634, 2022. 2, 3
work page 2022
-
[26]
Modar: Using motion forecasting for 3d object detection in point cloud sequences
Yingwei Li, Charles R Qi, Yin Zhou, Chenxi Liu, and Dragomir Anguelov. Modar: Using motion forecasting for 3d object detection in point cloud sequences. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9329–9339, 2023. 3, 7, 1
work page 2023
-
[27]
Stellar: Scaling 3d perception large models for autonomous driving,
Yingwei Li, Xin Huang, Yang Liu, Yang Fu, Alex Zihao Zhu, Chen Song, Junwen Yao, Anant Subramanian, Hao Xiang, Weijing Shi, Yuliang Zou, Tom Hoddes, Zhaoqi Leng, Govind Thattai, Dragomir Anguelov, and Mingxing Tan. Stellar: Scaling 3d perception large models for autonomous driving,
-
[28]
Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Qiao Yu, and Jifeng Dai. Bevformer: learning bird’s-eye-view representation from lidar-camera via spatiotemporal transformers.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. 2
work page 2024
-
[29]
Pnpnet: End-to-end perception and prediction with tracking in the loop
Ming Liang, Bin Yang, Wenyuan Zeng, Yun Chen, Rui Hu, Sergio Casas, and Raquel Urtasun. Pnpnet: End-to-end perception and prediction with tracking in the loop. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11553–11562, 2020. 3
work page 2020
-
[30]
Maptr: Structured modeling and learning for online vectorized hd map construction
Yihan Liao, Yicheng Li, Yinghong Chen, Qinhong Zhang, and Li Zhang. Maptr: Structured modeling and learning for online vectorized hd map construction. InInternational Conference on Learning Representations (ICLR), 2023. arXiv preprint arXiv:2303.12574. 2, 3
-
[31]
Focal loss for dense object detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Doll ´ar. Focal loss for dense object detection. InProceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017. 5
work page 2017
-
[32]
Vectormapnet: End-to-end vectorized hd map learning
Jing Liu, Zheng Ding, Tianqi Zhang, Jiaqi Chen, and Jifeng Zhang. Vectormapnet: End-to-end vectorized hd map learning. InInternational Conference on Machine Learning (ICML), 2023. arXiv preprint arXiv:2303.08785. 2, 3
-
[33]
Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, and Song Han. Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation.arXiv preprint arXiv:2205.13542, 2022. 4, 7
-
[34]
Zhe Liu, Jinghua Hou, Xinyu Wang, Xiaoqing Ye, Jingdong Wang, Hengshuang Zhao, and Xiang Bai. Lion: Linear group rnn for 3d object detection in point clouds.Advances in Neural Information Processing Systems, 37:13601–13626,
-
[35]
Seed: A simple and effective 3d detr in point clouds
Zhe Liu, Jinghua Hou, Xiaoqing Ye, Tong Wang, Jingdong Wang, and Xiang Bai. Seed: A simple and effective 3d detr in point clouds. InEuropean Conference on Computer Vision, pages 110–126. Springer, 2024. 7
work page 2024
-
[36]
Rethinking network design and local geometry in point cloud: A simple resid- ual mlp framework
Xu Ma, Can Qin, Haoxuan You, Haoxi Ran, and Yun Fu. Rethinking network design and local geometry in point cloud: A simple residual mlp framework.arXiv preprint arXiv:2202.07123, 2022. 4
-
[37]
Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis.Communications of the ACM, 65(1):99–106, 2021. 2
work page 2021
-
[38]
PhD thesis, Massachusetts Institute of Technology, 2022
Teddy Ort.Autonomous navigation without HD prior maps. PhD thesis, Massachusetts Institute of Technology, 2022. 2
work page 2022
-
[39]
Surfels: Surface elements as rendering primitives
Hanspeter Pfister, Matthias Zwicker, Jeroen Van Baar, and Markus Gross. Surfels: Surface elements as rendering primitives. InProceedings of the 27th annual conference on Computer graphics and interactive techniques, pages 335–342, 2000. 2, 3
work page 2000
-
[40]
Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3d
Jonah Philion and Sanja Fidler. Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3d. InEuropean conference on computer vision, pages 194–210. Springer, 2020. 2, 4, 6
work page 2020
-
[41]
Pillarnet: Real-time and high-performance pillar-based 3d object detection
Guangsheng Shi, Ruifeng Li, and Chao Ma. Pillarnet: Real-time and high-performance pillar-based 3d object detection. InECCV, 2022. 6
work page 2022
-
[42]
Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, and Hongsheng Li. From points to parts: 3d object detection from point cloud with part-aware and part-aggregation network. In TPAMI, 2019. 6
work page 2019
-
[43]
Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, and Hongsheng Li. Pv-rcnn++: Point-voxel feature set abstraction with local vector representation for 3d object detection. InIJCV, 2023. 6
work page 2023
-
[44]
Complex-yolo: An euler-region-proposal for real-time 3d object detection on point clouds
Martin Simony, Stefan Milzy, Karl Amendey, and Horst-Michael Gross. Complex-yolo: An euler-region-proposal for real-time 3d object detection on point clouds. InProceedings of the European conference on computer vision (ECCV) workshops, pages 0–0, 2018. 2
work page 2018
-
[45]
Scalability in perception for autonomous driving: Waymo open dataset
Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, and Dragomir Anguelov. Scalability in perception...
work page 2020
-
[46]
Rsn: Range sparse net for efficient, accurate lidar 3d object detection
Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, and Dragomir Anguelov. Rsn: Range sparse net for efficient, accurate lidar 3d object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5725–5734, 2021. 5
work page 2021
-
[47]
Swformer: Sparse window transformer for 3d object detection in point clouds
Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, and Dragomir Anguelov. Swformer: Sparse window transformer for 3d object detection in point clouds. InECCV, 2022. 2, 3, 4, 5, 6, 7
work page 2022
-
[48]
Block-nerf: Scalable large scene neural view synthesis
Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P Srinivasan, Jonathan T Barron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8248–8258, 2022. 2
work page 2022
-
[49]
Fully convolutional one-stage 3d object detection on liDAR range images
Zhi Tian, Xiangxiang Chu, Xiaoming Wang, Xiaolin Wei, and Chunhua Shen. Fully convolutional one-stage 3d object detection on liDAR range images. InNIPS, 2022. 2
work page 2022
-
[50]
Dsvt: Dynamic sparse voxel transformer with rotated sets
Haiyang Wang, Chen Shi, Shaoshuai Shi, Meng Lei, Sen Wang, Di He, Bernt Schiele, and Liwei Wang. Dsvt: Dynamic sparse voxel transformer with rotated sets. InCVPR, 2023. 6
work page 2023
-
[51]
Benny Wijaya, Kun Jiang, Mengmeng Yang, Tuopu Wen, Yunlong Wang, Xuewei Tang, Zheng Fu, Taohua Zhou, and Diange Yang. High definition map mapping and update: A general overview and future directions.arXiv preprint arXiv:2409.09726, 2024. 2
-
[52]
3dgut: Enabling distorted cameras and secondary rays in gaussian splatting
Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, and Zan Gojcic. 3dgut: Enabling distorted cameras and secondary rays in gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26036–26046, 2025. 4
work page 2025
-
[53]
Jinsheng Xiao, Shurui Wang, Jian Zhou, Ziyue Tian, Hongping Zhang, and Yuan-Fang Wang. Mim: High-definition maps incorporated multi-view 3d object detection.IEEE Transactions on Intelligent Transportation Systems, 26(3):3989–4001, 2025. 2
work page 2025
-
[54]
Neural map prior for autonomous driving
Xuan Xiong, Yicheng Liu, Tianyuan Yuan, Yue Wang, Yilun Wang, and Zhao Hang. Neural map prior for autonomous driving. InProceedings of the IEEE/CVF International Conference on Computer Vision (CVPR), 2023. 2, 3
work page 2023
-
[55]
Second: Sparsely embedded convolutional detection
Yan Yan, Yuxing Mao, and Bo Li. Second: Sparsely embedded convolutional detection. InSensors, 2018. 2, 6
work page 2018
-
[56]
Hdnet: Exploiting hd maps for 3d object detection
Bin Yang, Ming Liang, and Raquel Urtasun. Hdnet: Exploiting hd maps for 3d object detection. InProceedings of The 2nd Conference on Robot Learning, pages 146–155. PMLR, 2018. 2
work page 2018
-
[57]
Surfelgan: Synthesizing realistic sensor data for autonomous driving.arXiv, 2020
Zhenpei Yang, Yuning Chai, Dragomir Anguelov, Yin Zhou, Pei Sun, Dumitru Erhan, Sean Rafferty, and Henrik Kretzschmar. Surfelgan: Synthesizing realistic sensor data for autonomous driving.arXiv, 2020. 2, 3
work page 2020
-
[58]
3d-man: 3d multi-frame attention network for object detection
Zetong Yang, Yin Zhou, Zhifeng Chen, and Jiquan Ngiam. 3d-man: 3d multi-frame attention network for object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1863–1872,
-
[59]
Center-based 3d object detection and tracking
Tianwei Yin, Xingyi Zhou, and Philipp Krahenbuhl. Center-based 3d object detection and tracking. InCVPR,
-
[60]
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, and Cho-Jui Hsieh. Large batch optimization for deep learning: Training bert in 76 minutes.arXiv preprint arXiv:1904.00962, 2019. 6
work page internal anchor Pith review Pith/arXiv arXiv 1904
-
[61]
Yurong You, Katie Z Luo, Xiangyu Chen, Junan Chen, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, and Kilian Q Weinberger. Hindsight is 20/20: Leveraging past traversals to aid 3d perception.arXiv preprint arXiv:2203.11405, 2022. 2
-
[62]
Motiontrack: End-to-end transformer-based multi-object tracking with lidar-camera fusion
Ce Zhang, Chengjie Zhang, Yiluan Guo, Lingji Chen, and Michael Happold. Motiontrack: End-to-end transformer-based multi-object tracking with lidar-camera fusion. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 151–160,
-
[63]
HEDNet: A hierarchical encoder-decoder network for 3d object detection in point clouds
Gang Zhang, Junnan Chen, Guohuan Gao, Jianmin Li, and Xiaolin Hu. HEDNet: A hierarchical encoder-decoder network for 3d object detection in point clouds. InNeurIPS,
-
[64]
Safdnet: A simple and effective network for fully sparse 3d object detection
Gang Zhang, Junnan Chen, Guohuan Gao, Jianmin Li, Si Liu, and Xiaolin Hu. Safdnet: A simple and effective network for fully sparse 3d object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14477–14486, 2024. 2, 6, 7
work page 2024
-
[65]
Iou loss for 2d/3d object detection
Dingfu Zhou, Jin Fang, Xibin Song, Chenye Guan, Junbo Yin, Yuchao Dai, and Ruigang Yang. Iou loss for 2d/3d object detection. In2019 international conference on 3D vision (3DV), pages 85–94. IEEE, 2019. 6
work page 2019
-
[66]
Xingyi Zhou, Dequan Wang, and Philipp Kr¨ahenb¨uhl. Objects as points.arXiv preprint arXiv:1904.07850, 2019. 5
work page internal anchor Pith review Pith/arXiv arXiv 1904
-
[67]
On the continuity of rotation representations in neural networks
Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. On the continuity of rotation representations in neural networks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5745–5753,
-
[68]
End-to-end multi-view fusion for 3d object detection in lidar point clouds
Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, and Vijay Vasudevan. End-to-end multi-view fusion for 3d object detection in lidar point clouds. InConference on Robot Learning, pages 923–932. PMLR, 2020. 4
work page 2020
-
[69]
Centerformer: Center-based transformer for 3d object detection
Zixiang Zhou, Xiangchen Zhao, Yu Wang, Panqu Wang, and Hassan Foroosh. Centerformer: Center-based transformer for 3d object detection. InECCV, 2022. 2, 6, 7
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.