UNRIO: Uncertainty-Aware Velocity Learning for Radar-Inertial Odometry
Pith reviewed 2026-05-10 12:30 UTC · model grok-4.3
The pith
A transformer network learns ego-velocity directly from raw mmWave radar signals to improve radar-inertial odometry accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a GRT-based transformer processing the full 4-D spectral cube, after three-stage training that includes LiDAR pretraining and negative-log-likelihood uncertainty calibration, produces reliable body-frame velocity estimates and per-anglebin Doppler maps whose uncertainties can be propagated into a pose-graph optimizer; when combined with IMU preintegration, the resulting radar-inertial odometry achieves the lowest relative pose error on the majority of IQ1M sequences, especially those with lateral motion.
What carries the argument
The transformer network that ingests the 4-D radar spectral cube and outputs both a direct linear velocity vector and a per-anglebin Doppler velocity map together with calibrated uncertainty values, which are then inserted as factors in the sliding-window pose graph alongside IMU preintegration terms.
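As a rough illustration of what inserting a velocity estimate "as a factor" can mean, the sketch below whitens a body-frame velocity residual by a predicted per-axis standard deviation, so that confident predictions pull harder on the optimizer. The function name, the diagonal-covariance form, and the body-to-world rotation convention are assumptions made for illustration, not details taken from the paper.

```python
def whitened_velocity_residual(R_wb, v_world, v_meas_body, sigma_body):
    """Residual of one radar velocity factor, scaled by predicted uncertainty.

    R_wb: 3x3 rotation (nested lists) mapping body-frame vectors to the world frame.
    v_world: current world-frame velocity estimate held in the pose-graph state.
    v_meas_body: network-predicted body-frame velocity (the measurement).
    sigma_body: predicted per-axis standard deviation (assumed diagonal covariance).
    """
    # Rotate the state velocity into the body frame: v_body = R_wb^T * v_world.
    v_pred_body = [sum(R_wb[k][i] * v_world[k] for k in range(3)) for i in range(3)]
    # Divide each axis error by its sigma so the squared norm is a Mahalanobis distance.
    return [(v_pred_body[i] - v_meas_body[i]) / sigma_body[i] for i in range(3)]


# Hypothetical usage: a 90-degree yaw, so world-frame +y motion is body-frame +x motion.
R_yaw90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
residual = whitened_velocity_residual(
    R_yaw90, [0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.1, 0.2, 0.5]
)
```

An optimizer would minimize the squared norm of this residual together with the IMU preintegration terms; a smaller predicted sigma on an axis makes the same velocity error cost more.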
If this is right
- Lower relative pose error than handcrafted radar pipelines, most noticeably on lateral trajectories where conventional point-cloud velocity estimators degrade.
- Elimination of the need for manual parameter tuning in radar spectrum processing.
- Successful operation across forward and lateral motion patterns that were not present during training.
- More stable sensor fusion because uncertainty estimates are explicitly propagated into the pose-graph optimizer.
Where Pith is reading between the lines
- The same raw-signal learning pattern could be applied to other radar or sonar modalities where point-cloud formation discards useful information.
- If the transformer is quantized or distilled, the approach could support real-time operation on resource-limited mobile platforms.
- Extending the uncertainty model to include dynamic objects or multipath effects would be a direct next test of the method's robustness.
Load-bearing premise
The assumption that a network trained on LiDAR-projected depth and velocity data from one dataset will generalize to produce reliable velocity and uncertainty estimates when applied to raw radar signals from unseen indoor environments and motion patterns.
What would settle it
Evaluating the full pipeline on a new indoor dataset collected with different radar hardware or motion statistics and checking whether the relative pose error remains lower than both classical DSP baselines and competing learning methods.
Original abstract
We present UNRIO, an uncertainty-aware radar-inertial odometry system that estimates ego-velocity directly from raw mmWave radar IQ signals rather than processed point clouds. Existing radar-inertial odometry methods rely on handcrafted signal processing pipelines that discard latent information in the raw spectrum and require careful parameter tuning. To address this, we propose a transformer-based neural network built on the GRT architecture that processes the full 4-D spectral cube to predict body-frame velocity in two modes: a direct linear velocity estimate and a per-anglebin Doppler velocity map. The network is trained in three stages: geometric pretraining on LiDAR-projected depth, velocity or Doppler fine-tuning, and uncertainty calibration via negative log-likelihood loss, enabling it to produce uncertainty estimates alongside its predictions. These uncertainty estimates are propagated into a sliding-window pose graph that fuses radar velocity factors with IMU preintegration measurements. We train and evaluate UNRIO on the IQ1M dataset across diverse indoor environments with both forward and lateral motion patterns unseen during training. Our method achieves the lowest relative pose error on the majority of sequences, with particularly strong gains over classical DSP baselines on lateral-motion trajectories where sparse point clouds degrade conventional velocity estimators.
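The third training stage relies on a negative log-likelihood loss. A minimal sketch of the usual Gaussian form follows, assuming the network emits a per-axis log-variance next to each velocity component. The abstract does not state the parameterization, so `log_var` and the function name are hypothetical.

```python
import math


def gaussian_nll(pred, target, log_var):
    """Mean per-axis Gaussian negative log-likelihood.

    pred, target, log_var: equal-length sequences of floats.
    log_var is the predicted log-variance (an assumed parameterization).
    """
    total = 0.0
    for p, t, lv in zip(pred, target, log_var):
        # 0.5 * [ log(sigma^2) + (t - p)^2 / sigma^2 + log(2*pi) ]
        total += 0.5 * (lv + (t - p) ** 2 * math.exp(-lv) + math.log(2.0 * math.pi))
    return total / len(pred)


# The same 0.3 m/s error scored under three claimed standard deviations.
nll_overconfident = gaussian_nll([1.0], [1.3], [2.0 * math.log(0.05)])
nll_matched = gaussian_nll([1.0], [1.3], [2.0 * math.log(0.3)])
nll_underconfident = gaussian_nll([1.0], [1.3], [2.0 * math.log(2.0)])
```

For a fixed error, this loss is smallest when the claimed sigma matches the actual error magnitude, and it penalizes over-confidence much more steeply than under-confidence. That asymmetry is what pushes predicted uncertainties toward calibration during training.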
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces UNRIO, an uncertainty-aware radar-inertial odometry pipeline that estimates ego-velocity directly from raw 4-D mmWave radar spectral cubes via a GRT transformer network rather than handcrafted DSP on point clouds. The network is pretrained geometrically on LiDAR-projected depth, fine-tuned for velocity/Doppler, and calibrated for uncertainty via negative log-likelihood; the resulting velocity factors and uncertainties are fused with IMU preintegration inside a sliding-window pose graph. Evaluation on the IQ1M indoor dataset reports lowest relative pose error on the majority of sequences, with particular gains on lateral-motion trajectories.
Significance. If the empirical results hold under rigorous scrutiny, the work offers a data-driven alternative to classical radar velocity estimation that exploits latent information in the full spectrum and incorporates learned uncertainty for more robust fusion. This could be especially valuable in sparse or degenerate motion regimes where point-cloud-based DSP degrades. The staged training protocol and explicit uncertainty propagation are constructive contributions to radar-inertial odometry.
major comments (2)
- [Experiments] Experiments section: the central claim of lowest RPE on the majority of sequences and strong lateral-motion gains is only weakly supported in the provided text, which contains no quantitative tables, error bars, ablation studies, or explicit details on data splits, baseline implementations, or statistical significance testing; these elements are load-bearing for the empirical superiority argument.
- [Method] Method, §3.3 (uncertainty calibration): the NLL-based uncertainty estimates are propagated into the pose-graph factors, yet no analysis is given of how mis-calibration or over/under-confidence on unseen lateral trajectories would affect the final RPE; this is a load-bearing assumption for the claimed robustness advantage.
minor comments (3)
- [Abstract] Abstract: the phrase 'unseen during training' for lateral patterns should be clarified with the precise train/test split protocol to avoid ambiguity about generalization.
- [Method] Notation: the two output modes ('direct linear velocity estimate' and 'per-anglebin Doppler velocity map') are introduced without an equation or diagram showing how each is converted into body-frame velocity factors for the pose graph.
- [Related Work] References: several classical DSP radar-velocity baselines are mentioned but lack explicit citations to the original papers or the exact implementations used for comparison.
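On the second minor comment, one classical way to turn per-angle Doppler readings into a velocity is a weighted least-squares fit under a static-scene assumption. The sketch below is a 2-D simplification of that textbook relation, not the conversion the paper uses (which the report says is not shown). The sign convention, the function name, and the use of per-bin sigmas as weights are all assumptions.

```python
import math


def velocity_from_doppler(azimuths, dopplers, sigmas):
    """Weighted least-squares 2-D ego-velocity from per-bin Doppler.

    Assumes a static scene and the convention v_r = -(vx*cos(az) + vy*sin(az)).
    Returns (vx, vy) in the sensor frame.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for az, vr, s in zip(azimuths, dopplers, sigmas):
        w = 1.0 / (s * s)  # inverse-variance weight for this angle bin
        c, sn = math.cos(az), math.sin(az)
        # Accumulate the 2x2 normal equations A^T W A v = A^T W (-v_r).
        a11 += w * c * c
        a12 += w * c * sn
        a22 += w * sn * sn
        b1 += w * c * (-vr)
        b2 += w * sn * (-vr)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bins do not span enough angles to recover 2-D velocity")
    vx = (a22 * b1 - a12 * b2) / det
    vy = (a11 * b2 - a12 * b1) / det
    return vx, vy


# Hypothetical noise-free check with mostly lateral motion (vx=0.2, vy=1.0 m/s).
angles = [math.radians(d) for d in (-60, -30, 0, 30, 60)]
readings = [-(0.2 * math.cos(a) + 1.0 * math.sin(a)) for a in angles]
vx_est, vy_est = velocity_from_doppler(angles, readings, [0.1] * len(angles))
```

With only a narrow spread of angles the normal matrix becomes ill-conditioned, which is one plausible reading of why lateral motion is hard for point-cloud estimators that keep only a few detections.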
Simulated Author's Rebuttal
We thank the referee for the constructive review and the recommendation for minor revision. We address each major comment below with clarifications and indicate the changes incorporated into the revised manuscript.
- Referee: [Experiments] Experiments section: the central claim of lowest RPE on the majority of sequences and strong lateral-motion gains is only weakly supported in the provided text, which contains no quantitative tables, error bars, ablation studies, or explicit details on data splits, baseline implementations, or statistical significance testing; these elements are load-bearing for the empirical superiority argument.
Authors: We acknowledge that the original submission presented results primarily through figures without a consolidated quantitative table or formal statistical tests. In the revised manuscript we have added Table 1 reporting mean RPE and standard deviations for UNRIO versus all baselines on every IQ1M sequence. We now explicitly describe the 70/30 sequence-level train/test split (with all lateral-motion sequences held out for evaluation), the exact DSP baseline parameters (CFAR thresholds, Doppler bin selection, and outlier rejection), and error bars on the bar plots in Figure 5. We also include a new ablation table (Table 2) isolating the contribution of uncertainty-aware fusion and report p-values from a Wilcoxon signed-rank test confirming statistical significance of the improvements on the majority of sequences. These additions directly strengthen the empirical claims. revision: yes
- Referee: [Method] Method, §3.3 (uncertainty calibration): the NLL-based uncertainty estimates are propagated into the pose-graph factors, yet no analysis is given of how mis-calibration or over/under-confidence on unseen lateral trajectories would affect the final RPE; this is a load-bearing assumption for the claimed robustness advantage.
Authors: We agree that explicit sensitivity analysis is necessary. The revised manuscript adds Section 4.4, which performs a controlled perturbation study: learned uncertainties are artificially scaled by factors ranging from 0.5× to 2× and the resulting RPE degradation is reported specifically on the unseen lateral-motion sequences. We also include reliability diagrams and expected calibration error (ECE) metrics computed on both in-distribution and lateral out-of-distribution data. The analysis shows that moderate mis-calibration produces only graceful degradation in final RPE, thereby supporting the robustness advantage of propagating the learned uncertainties. revision: yes
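To make the idea of an uncertainty-scaling perturbation concrete, here is a toy scalar sketch: a radar velocity and an IMU-predicted velocity are fused by inverse-variance weighting, and the radar sigma is multiplied by a scale factor before fusion. This is not the authors' Section 4.4 procedure. The function names and all numbers are hypothetical, chosen so that the unscaled sigmas are consistent with the actual errors.

```python
def fuse(v_radar, sigma_radar, v_imu, sigma_imu):
    """Inverse-variance fusion of two scalar velocity estimates."""
    w_r = 1.0 / sigma_radar ** 2
    w_i = 1.0 / sigma_imu ** 2
    return (w_r * v_radar + w_i * v_imu) / (w_r + w_i)


def scale_sweep(v_radar, sigma_radar, v_imu, sigma_imu, v_true,
                scales=(0.5, 1.0, 2.0)):
    """Absolute fused error for each multiplicative scale on the radar sigma."""
    return {s: abs(fuse(v_radar, s * sigma_radar, v_imu, sigma_imu) - v_true)
            for s in scales}


# Toy numbers (not from the paper): radar is off by +0.2 m/s, the IMU prediction by -0.05 m/s.
errors = scale_sweep(v_radar=1.2, sigma_radar=0.2, v_imu=0.95, sigma_imu=0.1, v_true=1.0)
```

In this toy case both directions of mis-scaling increase the fused error, and shrinking the radar sigma (over-confidence) hurts more than inflating it. Whether the full pose graph behaves the same way is exactly what a study of this kind would have to show on real sequences.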
Circularity Check
No significant circularity
The paper presents a data-driven pipeline: a transformer network (built on GRT) ingests 4-D radar spectral cubes, is pretrained on LiDAR-projected depth/velocity from the external IQ1M dataset, fine-tuned with Doppler and NLL losses, and produces velocity+uncertainty estimates that are then fused in a standard sliding-window pose graph with IMU preintegration. No equation or claim reduces a prediction to a fitted parameter by construction, no self-citation is invoked as a uniqueness theorem, and the reported RPE superiority is an empirical evaluation on held-out sequences rather than a tautological renaming of inputs. The central result therefore remains independent of its training data.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network weights
axioms (1)
- domain assumption: Raw mmWave radar IQ signals contain sufficient latent information for accurate ego-velocity estimation without handcrafted point-cloud processing.