Meridian: Metric-Semantic Primitive Matching for Cross-View Geo-Localization Beyond Urban Environments

Camillo Jose Taylor; Carlos Nieto-Granda; Fernando Cladera; Jonathan P. How; Mason Peterson; Qingyuan Li; Yixuan Jia

arxiv: 2606.06312 · v1 · pith:W7UYP4MXnew · submitted 2026-06-04 · 💻 cs.RO

Mason Peterson, Qingyuan Li, Yixuan Jia, Fernando Cladera, Carlos Nieto-Granda, Camillo Jose Taylor, Jonathan P. How

Pith reviewed 2026-06-28 00:48 UTC · model grok-4.3

classification 💻 cs.RO

keywords geo-localization, cross-view matching, metric-semantic primitives, robot navigation, aerial imagery, pose optimization, GNSS-denied environments, unstructured terrain

The pith

Meridian matches metric-semantic primitives between aerial imagery and ground RGB-D data to localize robots globally without environment-specific training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces Meridian as a technique for cross-view geo-localization by matching high-level metric-semantic primitives from overhead aerial images to ground robot camera observations. Novel consistency metrics are used to build a distribution over possible submap poses and to filter out bad hypotheses before pose graph optimization. The goal is to achieve reliable localization in GNSS-denied areas that include both structured and unstructured natural settings. A reader would care if this removes the need for retraining models when moving to new terrains like parks, campuses, or wilderness. The result is demonstrated as 2.4 meter average error on 19 kilometers of robot travel across multiple datasets.

Core claim

The central discovery is that matching metric-semantic primitives across aerial and ground views, combined with consistency metrics for pose estimation and outlier rejection, allows accurate global localization of ground robots in diverse environments without any training or fine-tuning on area-specific data.

What carries the argument

metric-semantic primitive matching using novel consistency metrics to estimate pose distributions and reject outliers within a pose graph optimization framework
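
The abstract does not give the form of the consistency metrics, so the following is only a generic sketch of one common idea in robust data association: two candidate associations are treated as mutually consistent when the distance between the two aerial primitives matches the distance between the two ground primitives. The function names, the tolerance `tol`, and the toy coordinates are assumptions made for illustration, and the greedy selection is a heuristic stand-in, not the paper's method.

```python
import math

def pairwise_consistency(assoc_a, assoc_b, tol=1.0):
    """True if two (aerial_pt, ground_pt) associations preserve distance within tol metres."""
    (a1, g1), (a2, g2) = assoc_a, assoc_b
    return abs(math.dist(a1, a2) - math.dist(g1, g2)) <= tol

def greedy_consistent_set(associations, tol=1.0):
    """Greedily grow a set of mutually consistent associations.

    This is a heuristic, not an exact maximum-clique solver.
    """
    n = len(associations)
    ok = [
        [i != j and pairwise_consistency(associations[i], associations[j], tol)
         for j in range(n)]
        for i in range(n)
    ]
    # Visit associations with the most consistent partners first.
    order = sorted(range(n), key=lambda i: sum(ok[i]), reverse=True)
    chosen = []
    for i in order:
        if all(ok[i][j] for j in chosen):
            chosen.append(i)
    return sorted(chosen)

# Toy data (made up): the first three ground points are the aerial points
# rotated by 90 degrees and shifted by (2, 3); the fourth pair is a wrong match.
associations = [
    ((0.0, 0.0), (2.0, 3.0)),
    ((10.0, 0.0), (2.0, 13.0)),
    ((0.0, 5.0), (-3.0, 3.0)),
    ((7.0, 7.0), (30.0, 30.0)),
]
inliers = greedy_consistent_set(associations)
print(inliers)
```

A distance-preservation test of this kind needs no initial pose guess, which is why variants of it are popular for global matching. It also shows the failure mode the review worries about: in repetitive geometry, several different association sets can all pass the test.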

If this is right

Accurate localization supports repeatable robot tasks and safe operation in GNSS-denied outdoor areas.
The approach handles repetitive geometries and featureless landscapes common in natural terrain.
Generalization occurs across autonomous driving, park, campus, and wilderness environments without retraining.
Trajectory estimation benefits from robust rejection of outlier hypotheses during optimization.
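
To make the idea of robust outlier handling concrete, here is a minimal sketch of iteratively reweighted least squares with a Huber kernel on a deliberately reduced problem: estimating a single 2D translation offset from several noisy measurements, one of which is grossly wrong. A full pose graph has many poses and rotation terms, and the abstract does not say which robust mechanism Meridian uses, so the function names, the kernel choice, and the threshold `delta` are all assumptions.

```python
import math
from statistics import median

def huber_weight(residual_norm, delta):
    """IRLS weight for the Huber kernel: 1 inside delta, delta / r outside."""
    return 1.0 if residual_norm <= delta else delta / residual_norm

def robust_offset(measurements, delta=2.0, iterations=20):
    """Estimate a 2D offset from noisy measurements with Huber-weighted IRLS."""
    # Start from the component-wise median so a gross outlier cannot dominate.
    est = (median(m[0] for m in measurements),
           median(m[1] for m in measurements))
    for _ in range(iterations):
        weights = [huber_weight(math.dist(m, est), delta) for m in measurements]
        total = sum(weights)
        est = (
            sum(w * m[0] for w, m in zip(weights, measurements)) / total,
            sum(w * m[1] for w, m in zip(weights, measurements)) / total,
        )
    return est

# Toy data (made up): four measurements near (5, -2) and one gross outlier.
measurements = [(5.1, -2.0), (4.9, -1.9), (5.0, -2.1), (5.0, -2.0), (40.0, 30.0)]
naive = (
    sum(m[0] for m in measurements) / len(measurements),
    sum(m[1] for m in measurements) / len(measurements),
)
est = robust_offset(measurements)
print("plain mean:", naive)
print("Huber IRLS:", est)
```

The Huber kernel bounds the pull of the bad measurement but does not remove it entirely, so the robust estimate still sits slightly off the inlier cluster. Kernels that reject outliers outright exist, at the cost of a harder optimization problem.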

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending the primitive matching to additional sensor modalities could broaden its use in multi-robot systems.
Applying similar consistency checks might improve other cross-view localization techniques in challenging conditions.
Long-term operation in changing environments could be tested by repeating traversals over time.
This method suggests potential for fully training-free global localization in robotics.

Load-bearing premise

The consistency metrics reliably estimate distributions over submap poses and reject outliers in repetitive or featureless areas without area-specific training or fine-tuning.

What would settle it

Observing high trajectory errors or failure to localize in a previously unseen repetitive geometry or featureless landscape would indicate the metrics do not generalize as claimed.

Figures

Figures reproduced from arXiv: 2606.06312 by Camillo Jose Taylor, Carlos Nieto-Granda, Fernando Cladera, Jonathan P. How, Mason Peterson, Qingyuan Li, Yixuan Jia.

**Figure 2.** Our cross-view localization pipeline begins by extracting …

**Figure 3.** Example aerial and ground segment maps converted to …

**Figure 4.** Point and line association consistency scoring. The ground …

**Figure 5.** Cross-view pose graph. Aerial patches are connected via …

**Figure 6.** Top row and bottom right two images: self-collected aerial images and the final optimized trajectory of each sequence shown in a …

Abstract

Successful robot automation requires accurate global localization to support repeatability, task planning, goal specification, and safe operation. However, reliable localization in GNSS-denied environments remains an open problem. Overhead aerial imagery offers a promising solution, but existing approaches primarily target structured urban environments and have been rarely demonstrated in unstructured natural terrain. Limitations of the state-of-the-art include a reliance on models trained for specific environments, as well as difficulty handling repetitive geometries and featureless landscapes commonly found in natural outdoor areas. To overcome these challenges, we present Meridian, a method for matching high-level metric-semantic primitives across aerial images and ground robot RGB-D camera data that achieves accurate global localization and generalizes well across diverse environments, all without any training or algorithmic fine-tuning on area-specific data. We formulate novel consistency metrics to estimate a distribution over robot submap poses and to reject outlier hypotheses in a robust pose graph optimization step for accurate robot trajectory estimation. We demonstrate that our algorithm can localize a ground robot across a wide variety of environments, including an autonomous driving dataset, a park and campus area, and a wilderness camp, with an average optimized trajectory error of 2.4 m over 19 km of ground traversal.
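
The headline figure is an average trajectory error after optimization. The abstract does not define the metric precisely, so the sketch below assumes a common choice: the mean Euclidean distance between time-aligned estimated and ground-truth 2D positions that are already expressed in the same global frame. The function name and the sample coordinates are illustrative only.

```python
import math

def mean_trajectory_error(estimated, ground_truth):
    """Mean Euclidean distance between matching positions of two trajectories."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and time-aligned")
    distances = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    return sum(distances) / len(distances)

# Toy data (made up for illustration, not taken from the paper).
ground_truth = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
estimated = [(0.0, 3.0), (10.0, 0.0), (24.0, 3.0)]
error_m = mean_trajectory_error(estimated, ground_truth)
print(f"mean error: {error_m:.2f} m")
```

Under this definition the per-pose errors here are 3 m, 0 m, and 5 m. Other conventions, such as a root-mean-square error, would give a different number for the same trajectory.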

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Meridian claims a training-free primitive matching method for cross-view localization that hits 2.4 m error over 19 km in mixed terrain, but the consistency metrics lack any visible formulation or validation against sparse features.

Meridian's core contribution is a method for cross-view geo-localization that matches high-level metric-semantic primitives between aerial images and ground robot RGB-D data. It claims this works across urban driving, park, campus, and wilderness settings without any area-specific training, delivering an average optimized trajectory error of 2.4 meters over 19 kilometers of ground traversal.

What the paper does is introduce novel consistency metrics that estimate a distribution over possible robot submap poses and then use those to reject outlier hypotheses inside a pose graph optimization. This is positioned as a way to deal with the repetitive geometries and featureless landscapes that are common outside cities.
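
As a rough picture of what a distribution over submap poses could look like, the sketch below turns per-hypothesis scores into probabilities with a softmax and discards hypotheses that fall below a probability threshold. The scoring scale, the `temperature` and `min_prob` parameters, and the example poses are hypothetical; the paper's actual estimator is not described in the abstract.

```python
import math

def pose_distribution(scores, temperature=1.0):
    """Softmax over hypothesis scores; a higher score gives a higher probability."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reject_unlikely(hypotheses, scores, min_prob=0.05):
    """Keep only (hypothesis, probability) pairs with probability >= min_prob."""
    probs = pose_distribution(scores)
    return [(h, p) for h, p in zip(hypotheses, probs) if p >= min_prob]

# Toy data (made up): three candidate poses (x, y, yaw) with consistency scores.
hypotheses = [(12.0, 4.0, 0.1), (52.0, 4.0, 0.1), (-30.0, 80.0, 2.0)]
scores = [8.0, 7.5, 2.0]
probs = pose_distribution(scores)
kept = reject_unlikely(hypotheses, scores)
for pose, p in kept:
    print(pose, round(p, 3))
```

The first two hypotheses have similar scores and both survive, which mimics a repetitive scene where one submap fits two places almost equally well. Keeping both, rather than committing to the best one, lets a later optimization stage resolve the ambiguity with additional constraints.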

The approach has some merit in its problem statement. Reliable GNSS-denied localization in unstructured terrain is indeed a longstanding issue for applications like disaster response and environmental monitoring. Testing on multiple environment types is better than the usual urban-only evaluations.

The soft spots are noticeable. The abstract supplies no equations or pseudocode for the consistency metrics, no ablation experiments showing how performance changes with primitive count or density, and no per-dataset error breakdowns. The wilderness camp result is lumped into the average, so it is not clear how well the method actually performs where features are sparse. The stress-test concern about the metrics' ability to produce usable pose distributions and reject outliers in low-density cases is not addressed with evidence in the provided text. If those metrics implicitly rely on having enough primitives, the generalization claim weakens.
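
A small worked example shows why a pooled average can hide a weak environment. The numbers are invented placeholders, not results from the paper, and the distance weighting is an assumption, since the abstract does not say how the average was formed.

```python
def distance_weighted_mean(errors_m, lengths_km):
    """Average of per-environment errors weighted by traversal length."""
    total_km = sum(lengths_km)
    return sum(e * l for e, l in zip(errors_m, lengths_km)) / total_km

# Placeholder values: name -> (mean error in metres, traversal length in km).
segments = {"env_a": (1.0, 9.0), "env_b": (10.0, 1.0)}
errors = [err for err, _ in segments.values()]
lengths = [km for _, km in segments.values()]

pooled = distance_weighted_mean(errors, lengths)
worst = max(segments, key=lambda name: segments[name][0])
print(f"pooled mean: {pooled:.1f} m, worst environment: {worst}")
```

With these placeholders the pooled figure is 1.9 m even though the short environment has a 10 m error, which is why a per-dataset breakdown matters for judging the sparse-feature case.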

Based on the abstract alone, the paper does not appear to rest its central claim on fitted parameters, and it shows no internal contradictions.

This paper would be of interest to researchers in robot perception and localization who are looking for non-learned, generalizable techniques. Someone working on field autonomy might pick up the primitive matching idea even if they end up modifying it.

I would recommend sending it for peer review. The topic is relevant and the high-level framing is reasonable, but the referees will need to see the full methods section and additional validation to judge whether the results support the claims.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Meridian, a training-free approach to cross-view geo-localization that matches high-level metric-semantic primitives extracted from aerial imagery against ground-robot RGB-D submaps. Novel consistency metrics are introduced to produce a distribution over submap poses and to reject outliers inside a robust pose-graph optimization stage; the central empirical claim is an average optimized trajectory error of 2.4 m across 19 km of traversal spanning an autonomous-driving dataset, park/campus scenes, and a wilderness camp.

Significance. If the reported accuracy and zero-shot generalization hold, the work would meaningfully extend metric-semantic localization beyond the urban settings that dominate the literature, offering a practical route to reliable GNSS-denied operation in unstructured natural terrain.

major comments (2)

[Abstract] Abstract (and §4 experiments): the headline 2.4 m / 19 km result across wilderness data is load-bearing on the claim that the consistency metrics can both produce a usable pose distribution and reject outliers when primitive density is low; no formulation of the metrics, ablation on their sensitivity to primitive sparsity, or per-environment rejection-rate breakdown is supplied to substantiate the no-training generalization.
[Method] Method section (consistency-metric definitions): without the explicit equations or algorithmic description of how the metrics estimate pose distributions and perform outlier rejection, it remains unclear whether they implicitly require a minimum feature density that is routinely absent in repetitive or featureless wilderness geometry.
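
The feature-density concern can be made concrete with a standard closed-form alignment. A planar rigid pose has three degrees of freedom, so a least-squares fit from matched points needs at least two distinct correspondences. The sketch below fits a 2D rotation and translation that maps ground-frame points onto aerial-frame points. It is a generic textbook procedure with hypothetical names and toy data, not the formulation used in the paper.

```python
import math

def fit_rigid_2d(ground_pts, aerial_pts):
    """Least-squares (theta, tx, ty) such that aerial ~= R(theta) * ground + t."""
    n = len(ground_pts)
    if n != len(aerial_pts) or n < 2:
        raise ValueError("need at least two matched point pairs")
    gx = sum(p[0] for p in ground_pts) / n
    gy = sum(p[1] for p in ground_pts) / n
    ax = sum(q[0] for q in aerial_pts) / n
    ay = sum(q[1] for q in aerial_pts) / n
    s_cos = s_sin = 0.0
    for (px, py), (qx, qy) in zip(ground_pts, aerial_pts):
        px, py, qx, qy = px - gx, py - gy, qx - ax, qy - ay
        s_cos += px * qx + py * qy  # sum of dot products of centred pairs
        s_sin += px * qy - py * qx  # sum of 2D cross products of centred pairs
    if math.hypot(s_cos, s_sin) < 1e-12:
        raise ValueError("degenerate configuration: rotation is unobservable")
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    tx = ax - (c * gx - s * gy)
    ty = ay - (s * gx + c * gy)
    return theta, tx, ty

# Toy data (made up): aerial points are the ground points rotated by 90 degrees
# and shifted by (5, -3).
ground = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
aerial = [(5.0, -3.0), (5.0, -2.0), (3.0, -3.0)]
theta, tx, ty = fit_rigid_2d(ground, aerial)
print(math.degrees(theta), tx, ty)
```

With a single correspondence, or with coincident points, the rotation cannot be recovered, which is the kind of implicit density requirement the referee asks the authors to rule out or quantify.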

minor comments (2)

The abstract would be clearer if it named the concrete primitive types (e.g., planes, lines, semantic classes) extracted from both aerial and ground data.
Figure captions and axis labels in the experimental section should explicitly state whether the reported errors are before or after the final pose-graph optimization step.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The two major comments highlight the need for greater clarity on the consistency metrics. We address each point below and will revise the manuscript accordingly.

Referee: [Abstract] Abstract (and §4 experiments): the headline 2.4 m / 19 km result across wilderness data is load-bearing on the claim that the consistency metrics can both produce a usable pose distribution and reject outliers when primitive density is low; no formulation of the metrics, ablation on their sensitivity to primitive sparsity, or per-environment rejection-rate breakdown is supplied to substantiate the no-training generalization.

Authors: We agree that the headline result relies on the metrics' ability to handle low-density primitives. The revised manuscript will include the full mathematical formulation of the consistency metrics in §3, an ablation study varying primitive density (including wilderness subsets), and a per-environment breakdown of outlier rejection rates. These additions will directly substantiate the zero-shot generalization claim.

Revision: yes

Referee: [Method] Method section (consistency-metric definitions): without the explicit equations or algorithmic description of how the metrics estimate pose distributions and perform outlier rejection, it remains unclear whether they implicitly require a minimum feature density that is routinely absent in repetitive or featureless wilderness geometry.

Authors: The current manuscript describes the metrics at a high level but lacks the requested explicit equations and pseudocode. We will expand §3 with the complete equations for pose-distribution estimation and the robust outlier-rejection procedure inside the pose-graph optimizer, plus a short analysis of behavior under sparse or repetitive geometry. This will clarify that the metrics do not presuppose a minimum feature density.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on novel metrics

The abstract and description present Meridian as introducing new consistency metrics that estimate pose distributions and reject outliers, with demonstrations across environments and without training or fine-tuning. No equations, self-citations, or fitted parameters are shown that would reduce the central claim (2.4 m error over 19 km) to its inputs by construction. The method is checked against external benchmarks through cross-environment validation. The reader's assessment gave a score of 2.0, but a score of 0 is warranted given the absence of load-bearing reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the consistency metrics and primitives are described at a conceptual level only.

[1] [1]

Part1 prelude,

L. Carlone, A. Kim, T. Barfoot, D. Cremers, and F. Dellaert, “Part1 prelude,” inSLAM Handbook. From Localization and Mapping to Spatial Intelligence, L. Carlone, A. Kim, T. Barfoot, D. Cremers, and F. Dellaert, Eds. Cambridge University Press, 2026

2026

[2] [2]

Satellite image-based localization via learned embeddings,

D.-K. Kim and M. R. Walter, “Satellite image-based localization via learned embeddings,” in2017 IEEE international conference on robotics and automation (ICRA). IEEE, 2017, pp. 2073–2080

2017

[3] [3]

Satellite image based cross-view localization for autonomous vehicle,

S. Wang, Y . Zhang, A. V ora, A. Perincherry, and H. Li, “Satellite image based cross-view localization for autonomous vehicle,” in2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 3592–3599

2023

[4] [4]

Any way you look at it: Semantic crossview localization and mapping with lidar,

I. D. Miller, A. Cowley, R. Konkimalla, S. S. Shivakumar, T. Nguyen, T. Smith, C. J. Taylor, and V . Kumar, “Any way you look at it: Semantic crossview localization and mapping with lidar,”IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 2397–2404, 2021

2021

[5] [5]

Adaptive teams of autonomous aerial and ground robots for situational aware- ness,

M. A. Hsieh, A. Cowley, J. F. Keller, L. Chaimowicz, B. Grocholsky, V . Kumar, C. J. Taylor, Y . Endo, R. C. Arkin, B. Jung,et al., “Adaptive teams of autonomous aerial and ground robots for situational aware- ness,”Journal of field robotics, vol. 24, no. 11-12, pp. 991–1014, 2007

2007

[6] [6]

Fgˆ 2: Fine-grained cross-view localization by fine-grained feature matching,

Z. Xia and A. Alahi, “Fgˆ 2: Fine-grained cross-view localization by fine-grained feature matching,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 6362–6372

2025

[7] [7]

Increasing slam pose accuracy by ground-to-satellite image registration,

Y . Zhang, Y . Shi, S. Wang, A. V ora, A. Perincherry, Y . Chen, and H. Li, “Increasing slam pose accuracy by ground-to-satellite image registration,” in2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024, pp. 8522–8528

2024

[8] [8]

View from above: Orthogonal-view aware cross- view localization,

S. Wang, C. Nguyen, J. Liu, Y . Zhang, S. Muthu, F. A. Maken, K. Zhang, and H. Li, “View from above: Orthogonal-view aware cross- view localization,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 14 843–14 852

2024

[9] [9]

Beyond cross-view image retrieval: Highly accurate vehicle localization using satellite image,

Y . Shi and H. Li, “Beyond cross-view image retrieval: Highly accurate vehicle localization using satellite image,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022

2022

[10] [10]

Fast segment anything,

X. Zhao, W. Ding, Y . An, Y . Du, T. Yu, M. Li, M. Tang, and J. Wang, “Fast segment anything,”arXiv preprint arXiv:2306.12156, 2023

work page arXiv 2023

[11] [11]

Segment anything,

A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y . Lo,et al., “Segment anything,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4015–4026

2023

[12] [12]

DINOv2: Learning Robust Visual Features without Supervision

M. Oquab, T. Darcet, T. Moutakanni, H. V o, M. Szafraniec, V . Khali- dov, P. Fernandez, D. Haziza, F. Massa, A. El-Nouby,et al., “Dinov2: Learning robust visual features without supervision,”arXiv preprint arXiv:2304.07193, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023

[13] [13]

Cross-view image geolocaliza- tion,

T.-Y . Lin, S. Belongie, and J. Hays, “Cross-view image geolocaliza- tion,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 891–898

2013

[14] [14]

Cross- view geo-localization: a survey,

A. Durgam, S. Paheding, V . Dhiman, and V . Devabhaktuni, “Cross- view geo-localization: a survey,”IEEE Access, 2024

2024

[15] [15]

Cvm-net: Cross-view matching network for image-based ground-to-aerial geo-localization,

S. Hu, M. Feng, R. M. Nguyen, and G. H. Lee, “Cvm-net: Cross-view matching network for image-based ground-to-aerial geo-localization,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7258–7267

2018

[16] [16]

City-wide street-to-satellite image geolocalization of a mobile ground agent,

L. M. Downes, D.-K. Kim, T. J. Steiner, and J. P. How, “City-wide street-to-satellite image geolocalization of a mobile ground agent,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 11 102–11 108

2022

[17] [17]

Revisiting cross-view localization from image matching,

P. Xia, Q. Wu, L. Yu, Y . Liu, M. Xiong, L. Liang, Y . Zhang, and Y . Wan, “Revisiting cross-view localization from image matching,” arXiv e-prints, pp. arXiv–2508, 2025

2025

[18] [18]

Uncertainty-aware vision-based metric cross-view geolocal- ization,

F. Fervers, S. Bullinger, C. Bodensteiner, M. Arens, and R. Stiefel- hagen, “Uncertainty-aware vision-based metric cross-view geolocal- ization,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 21 621–21 631

2023

[19] [19]

Geo-localization based on dynamically weighted factor- graph,

M. ´A. Mu ˜noz-Ba˜n´on, A. Olivas, E. Velasco-S ´anchez, F. A. Candelas, and F. Torres, “Geo-localization based on dynamically weighted factor- graph,”IEEE Robotics and Automation Letters, vol. 9, no. 6, pp. 5599– 5606, 2024

2024

[20] [20]

Global local- ization in unstructured environments using semantic object maps built from various viewpoints,

J. Ankenbauer, P. C. Lusk, A. Thomas, and J. P. How, “Global local- ization in unstructured environments using semantic object maps built from various viewpoints,” in2023 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, 2023, pp. 1358–1365

2023

[21] [21]

Osm-slam: Aiding slam with openstreetmaps priors,

M. Frosi, V. Gobbi, and M. Matteucci, “Osm-slam: Aiding slam with openstreetmaps priors,” Frontiers in Robotics and AI, vol. 10, p. 1064934, 2023

2023

[22] [22]

Global localization using openstreetmap and elevation offsets,

A. Przewodowski, F. S. Osório, and V. G. Junior, “Global localization using openstreetmap and elevation offsets,” J. Braz. Comput. Soc., vol. 30, no. 1, pp. 264–273, 2024

2024

[23] [23]

Odometry-assisted lidar-openstreetmap matching method for vehicle global positioning,

Z. Li, R. Zuo, Y. Wang, F. Ding, C. Wei, and M. Wu, “Odometry-assisted lidar-openstreetmap matching method for vehicle global positioning,” IEEE Internet of Things Journal, 2026

2026

[24] [24]

Autonomous vehicle localization without prior high-definition map,

S. Lee and J.-H. Ryu, “Autonomous vehicle localization without prior high-definition map,” IEEE Transactions on Robotics, vol. 40, pp. 2888–2906, 2024

2024

[25] [25]

Fast global registration,

Q.-Y. Zhou, J. Park, and V. Koltun, “Fast global registration,” in European conference on computer vision. Springer, 2016, pp. 766–782

2016

[26] [26]

Teaser: Fast and certifiable point cloud registration,

H. Yang, J. Shi, and L. Carlone, “Teaser: Fast and certifiable point cloud registration,” IEEE Transactions on Robotics, vol. 37, no. 2, pp. 314–333, 2020

2020

[27] [27]

Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,

M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981

1981

[28] [28]

CLIPPER: Robust Data Association without an Initial Guess,

P. C. Lusk and J. P. How, “CLIPPER: Robust Data Association without an Initial Guess,” IEEE Robotics and Automation Letters, 2024

2024

[29] [29]

Incremental-segment-based localization in 3-d point clouds,

R. Dubé, M. G. Gollub, H. Sommer, I. Gilitschenski, R. Siegwart, C. Cadena, and J. Nieto, “Incremental-segment-based localization in 3-d point clouds,” IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 1832–1839, 2018

2018

[30] [30]

ROMAN: Open-Set Object Map Alignment for Robust View-Invariant Global Localization,

M. B. Peterson, Y. X. Jia, Y. Tian, A. Thomas, and J. P. How, “ROMAN: Open-Set Object Map Alignment for Robust View-Invariant Global Localization,” in Robotics: Science and Systems (RSS), 2025

2025

[31] [31]

GraffMatch: Global matching of 3d lines and planes for wide baseline lidar registration,

P. C. Lusk, D. Parikh, and J. P. How, “GraffMatch: Global matching of 3d lines and planes for wide baseline lidar registration,” IEEE Robotics and Automation Letters, vol. 8, no. 2, pp. 632–639, 2022

2022

[32] [32]

Slim: Scalable and lightweight lidar mapping in urban environments,

Z. Yu, Z. Qiao, W. Liu, H. Yin, and S. Shen, “Slim: Scalable and lightweight lidar mapping in urban environments,” IEEE Transactions on Robotics, 2025

2025

[33] [33]

Distribution estimation for global data association via approximate bayesian inference,

Y. Jia, M. B. Peterson, Q. Li, Y. Tian, and J. P. How, “Distribution estimation for global data association via approximate bayesian inference,” arXiv preprint arXiv:2509.15565, 2025

2025

[34] [34]

AnyLoc: Towards universal visual place recognition,

N. Keetha, A. Mishra, J. Karhade, K. M. Jatavallabhula, S. Scherer, M. Krishna, and S. Garg, “AnyLoc: Towards universal visual place recognition,” IEEE Robotics and Automation Letters, vol. 9, no. 2, pp. 1286–1293, 2023

2023

[35] [35]

Least-squares fitting of two 3-d point sets,

K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” IEEE Transactions on pattern analysis and machine intelligence, no. 5, pp. 698–700, 1987

1987

[36] [36]

Pairwise consistent measurement set maximization for robust multi-robot map merging,

J. G. Mangelson, D. Dominic, R. M. Eustice, and R. Vasudevan, “Pairwise consistent measurement set maximization for robust multi-robot map merging,” in 2018 IEEE international conference on robotics and automation (ICRA). IEEE, 2018, pp. 2916–2923

2018

[37] [37]

Kimera-multi: Robust, distributed, dense metric-semantic slam for multi-robot systems,

Y. Tian, Y. Chang, F. H. Arias, C. Nieto-Granda, J. P. How, and L. Carlone, “Kimera-multi: Robust, distributed, dense metric-semantic slam for multi-robot systems,” IEEE Transactions on Robotics, vol. 38, no. 4, 2022

2022

[38] [38]

Vision meets robotics: The kitti dataset,

A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” The international journal of robotics research, vol. 32, no. 11, pp. 1231–1237, 2013

2013

[39] [39]

KISS-ICP: In Defense of Point-to-Point ICP – Simple, Accurate, and Robust Registration If Done the Right Way,

I. Vizzo, T. Guadagnino, B. Mersch, L. Wiesmann, J. Behley, and C. Stachniss, “KISS-ICP: In Defense of Point-to-Point ICP – Simple, Accurate, and Robust Registration If Done the Right Way,” IEEE Robotics and Automation Letters (RA-L), vol. 8, no. 2, pp. 1029–1036, 2023

2023

[40] [40]

Direct lidar odometry: Fast localization with dense point clouds,

K. Chen, B. T. Lopez, A.-a. Agha-mohammadi, and A. Mehta, “Direct lidar odometry: Fast localization with dense point clouds,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 2000–2007, 2022

2022

[41] [41]

Closed-form solution of absolute orientation using unit quaternions,

B. K. Horn, “Closed-form solution of absolute orientation using unit quaternions,” Journal of the optical society of America A, vol. 4, no. 4, pp. 629–642, 1987

1987