Recognition: 2 theorem links
MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction
Pith reviewed 2026-05-12 04:16 UTC · model grok-4.3
The pith
MAGS-SLAM lets multiple agents build a shared photorealistic 3D map from RGB images alone by exchanging compact submap summaries and fusing them consistently.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MAGS-SLAM is the first RGB-only multi-agent 3DGS SLAM framework for collaborative scene reconstruction. Each agent independently builds local monocular Gaussian submaps and transmits compact submap summaries rather than raw observations or dense maps. The framework integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion to resolve monocular scale ambiguity and produce coherent global maps without active depth sensors. On the introduced ReplicaMultiagent Plus benchmark and other datasets it achieves competitive tracking accuracy together with comparable or superior rendering quality relative to state-of-the-art RGB-D collaborative Gaussian SLAM methods, while relying only on RGB images.
What carries the argument
Compact submap communication combined with geometry- and appearance-aware loop verification and occupancy-aware Gaussian fusion, which together align independent monocular maps into a single consistent reconstruction.
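The fusion stage is described only at this level of abstraction. As a toy illustration of what "occupancy-aware" merging could mean (the voxel heuristic and all names below are our assumptions, not the authors' algorithm), one can drop incoming Gaussians whose voxel is already occupied by the global map:

```python
import numpy as np

def fuse_submap(global_means, new_means, voxel=0.05):
    """Merge Gaussian centers from a new submap into the global map,
    keeping only points whose voxel is not already occupied.
    A toy stand-in for occupancy-aware fusion, not the paper's method."""
    occupied = {tuple(v) for v in np.floor(global_means / voxel).astype(int)}
    keep = [p for p in new_means
            if tuple(np.floor(p / voxel).astype(int)) not in occupied]
    return np.vstack([global_means] + keep) if keep else global_means
```

A real system would presumably also weigh opacity, view coverage, and uncertainty rather than raw occupancy alone.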
If this is right
- Collaborative reconstruction becomes possible on robots equipped only with standard cameras.
- Loop verification maintains geometric and appearance consistency across agents despite independent scale estimates.
- Occupancy-aware fusion produces a single coherent global Gaussian map from separate local submaps.
- Real-time mapping and rendering performance is retained while eliminating depth-sensor hardware requirements.
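The cross-agent consistency claims ultimately rest on estimating a similarity (Sim(3)) transform between corresponding points of two monocular submaps. The classic closed form is Umeyama's method [49]; a minimal sketch of that standard algorithm (our own, not the paper's implementation):

```python
import numpy as np

def umeyama_sim3(src, dst):
    """Closed-form similarity transform (scale s, rotation R, translation t)
    minimizing ||dst - (s * R @ src + t)||^2 over Nx3 point sets
    (Umeyama, 1991)."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1  # guard against reflections
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t
```

With noiseless correspondences this recovers the transform exactly; in practice it would be wrapped in robust estimation (e.g. RANSAC) over verified loop matches.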
Where Pith is reading between the lines
- The same fusion strategy could be tested on agents with different camera intrinsics to check robustness beyond the current benchmark.
- Extending the compact summary format to include dynamic object labels would allow the method to handle moving elements in the scene.
- Running the system with larger teams of agents would reveal whether communication bandwidth or fusion time grows linearly.
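On the last point, a back-of-envelope comparison shows why compact summaries matter as teams grow. The 14-floats-per-Gaussian layout and the frame resolution below are our assumptions for illustration; the paper's actual summary format is not specified here:

```python
def summary_bytes(n_gaussians, floats_per_gaussian=14, bytes_per_float=4):
    # assumed layout: mean(3) + rotation(4) + scale(3) + opacity(1) + color(3)
    return n_gaussians * floats_per_gaussian * bytes_per_float

def raw_frames_bytes(n_frames, w=1200, h=680, channels=3):
    # uncompressed RGB frames at a Replica-style resolution
    return n_frames * w * h * channels

# e.g. a 50k-Gaussian submap summary vs 100 raw frames
print(summary_bytes(50_000) / 1e6, "MB vs", raw_frames_bytes(100) / 1e6, "MB")
```

Under these assumptions a submap summary is two orders of magnitude smaller than the raw frames it replaces, so per-agent bandwidth would be dominated by summary frequency rather than map size.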
Load-bearing premise
Geometry- and appearance-aware loop verification plus occupancy-aware Gaussian fusion can reliably resolve monocular scale ambiguity and produce coherent global maps from RGB images alone.
What would settle it
Run multiple agents on overlapping trajectories that induce large scale drift; check whether the fused map exhibits visible geometric distortions or photometric inconsistencies against ground-truth measurements.
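Such a test would likely be scored with absolute trajectory error plus a per-agent scale check. Minimal versions of both metrics, assuming trajectories given as Nx3 position arrays already aligned to a common frame (these are standard definitions, not the paper's evaluation code):

```python
import numpy as np

def ate_rmse(est, gt):
    """Absolute trajectory error (RMSE) between aligned Nx3 trajectories."""
    return float(np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1))))

def scale_ratio(est, gt):
    """Rough per-agent scale estimate: ratio of total path lengths;
    values far from 1.0 flag unresolved monocular scale drift."""
    length = lambda T: np.linalg.norm(np.diff(T, axis=0), axis=1).sum()
    return float(length(est) / length(gt))
```

If per-agent ATE stays low but the fused map shows distortions, the fusion stage rather than per-agent tracking would be implicated.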
Original abstract
Collaborative photorealistic 3D reconstruction from multiple agents enables rapid large-scale scene capture for virtual production and cooperative multi-robot exploration. While recent 3D Gaussian Splatting (3DGS) SLAM algorithms can generate high-fidelity real-time mapping, most of the existing multi-agent Gaussian SLAM methods still rely on RGB-D sensors to obtain metric depth and simplify cross-agent alignment, which limits deployment on lightweight, low-cost, or power-constrained robotic platforms. To address this challenge, we propose MAGS-SLAM, the first RGB-only multi-agent 3DGS SLAM framework for collaborative scene reconstruction. Each agent independently builds local monocular Gaussian submaps and transmits compact submap summaries rather than raw observations or dense maps. To facilitate robust collaboration in the presence of monocular scale ambiguity, our framework integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion, enabling coherent global reconstruction without active depth sensors. We further introduce the ReplicaMultiagent Plus benchmark for evaluating collaborative Gaussian SLAM. Intensive experiments on synthetic and real-world datasets show that MAGS-SLAM achieves competitive tracking accuracy and comparable or superior rendering quality to state-of-the-art RGB-D collaborative Gaussian SLAM methods while relying only on RGB images.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MAGS-SLAM, the first RGB-only multi-agent 3D Gaussian Splatting SLAM framework. Each agent constructs independent monocular Gaussian submaps and transmits compact summaries; geometry- and appearance-aware loop verification together with occupancy-aware fusion resolve scale ambiguity and produce coherent global maps. A new ReplicaMultiagent Plus benchmark is presented, and experiments on synthetic and real sequences report competitive tracking accuracy and comparable or superior rendering quality relative to state-of-the-art RGB-D collaborative Gaussian SLAM baselines.
Significance. If the empirical claims hold, the work is significant for enabling photorealistic collaborative 3DGS reconstruction on platforms that cannot carry depth sensors. The compact-summary communication and the explicit mechanisms for monocular scale handling are practical contributions that could broaden deployment in multi-robot exploration and virtual production. The new benchmark is a useful community resource.
minor comments (3)
- [Abstract] The phrases 'competitive tracking accuracy' and 'comparable or superior rendering quality' should be accompanied by the specific metrics (e.g., ATE RMSE, PSNR) and the exact baseline methods being compared.
- [§3.3] The geometry- and appearance-aware loop-verification thresholds are listed among the free parameters; default values and a brief sensitivity analysis would strengthen reproducibility.
- [Experiments] Confirm that all reported numbers come from multiple runs with standard deviations, and that the RGB-D baselines receive no inputs beyond the metric depth that the RGB-only system lacks.
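For concreteness, the PSNR the referee asks for is unambiguous once pinned down; below is the standard definition for 8-bit images (the common metric, not code from the paper):

```python
import numpy as np

def psnr(rendered, gt, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two image arrays."""
    mse = np.mean((rendered.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Reporting the `max_val` convention alongside the scores matters: PSNR computed on [0, 1] floats versus [0, 255] integers is identical only if `max_val` is set consistently.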
Simulated Author's Rebuttal
We thank the referee for the positive review and recommendation of minor revision. The recognition of MAGS-SLAM's practical contributions in RGB-only multi-agent 3DGS SLAM, including compact submap sharing and monocular scale handling, is appreciated. We will incorporate any minor suggestions during revision.
Circularity Check
No significant circularity
full rationale
The manuscript presents an engineering pipeline for multi-agent monocular 3DGS SLAM built from independent submap construction, compact summary exchange, geometry-appearance loop verification, and occupancy-aware fusion. These modules are described as novel algorithmic components rather than derived quantities obtained by fitting parameters to the target outputs or by self-referential definitions. No equations, uniqueness theorems, or ansatzes are shown to reduce to their own inputs by construction, and the evaluation relies on external benchmarks (ReplicaMultiagent Plus and real sequences) rather than internal self-consistency alone. The central claims therefore remain self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- loop-verification thresholds and fusion occupancy parameters
axioms (1)
- domain assumption: monocular scale ambiguity can be resolved via cross-agent geometry-appearance loop verification and occupancy-aware fusion
invented entities (1)
- compact submap summaries (no independent evidence)
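The two free parameters amount to a double acceptance gate on loop candidates; a toy version (threshold names and defaults are hypothetical, not taken from the paper):

```python
def verify_loop(geo_residual, pho_residual, tau_geo=0.05, tau_pho=0.1):
    """Accept a cross-agent loop candidate only if both the geometric
    residual (e.g. mean point error in meters after alignment) and the
    photometric residual (e.g. mean absolute color error) pass their
    thresholds. Hypothetical defaults for illustration only."""
    return geo_residual < tau_geo and pho_residual < tau_pho
```

A sensitivity analysis would sweep both thresholds and report the precision/recall of accepted loops, which is what the referee's §3.3 comment asks for.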
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · relevance unclear · matched text: "MAGS-SLAM integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion... Sim(3) submap pose graph... L_graph with r_geo and r_pho residuals"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relevance unclear · matched text: "L_JDSA = L_BA + Σ_k ||B_k d_k - d_prior,k||²; L_map = α·L1(Î, I) + β·L1(D̂, D) + ..."
Reference graph
Works this paper leans on
- [1] Paul J Besl and Neil D McKay. 1992. A method for registration of 3-D shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, Vol. 1611. SPIE, 586–606.
- [2] Carlos Campos, Richard Elvira, Juan J Gómez Rodríguez, José MM Montiel, and Juan D Tardós. 2021. ORB-SLAM3: An accurate open-source library for visual, visual-inertial, and multimap SLAM. IEEE Transactions on Robotics 37, 6 (2021), 1874–1890.
- [3]
- [4] Lin Chen, Yongxin Su, Jvboxi Wang, Pengcheng Han, Zhenyu Xia, Shuhui Bu, Kun Li, Boni Hu, Shengqi Meng, and Guangming Wang. 2026. CoMA-SLAM: Collaborative multi-agent Gaussian SLAM with geometric consistency. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 2922–2929.
- [5] Tianchen Deng, Guole Shen, Xun Chen, Shenghai Yuan, Hongming Shen, Guohao Peng, Zhenyu Wu, Jingchuan Wang, Lihua Xie, Danwei Wang, et al. 2025. MCN-SLAM: Multi-agent collaborative neural SLAM with hybrid implicit neural scene representation. arXiv preprint arXiv:2506.18678 (2025).
- [6] Tianchen Deng, Guole Shen, Chen Xun, Shenghai Yuan, Tongxin Jin, Hongming Shen, Yanbo Wang, Jingchuan Wang, Hesheng Wang, Danwei Wang, et al. 2025. MNE-SLAM: Multi-agent neural SLAM for mobile robots. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1485–1494.
- [7] Jakob Engel, Vladlen Koltun, and Daniel Cremers. 2017. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 3 (2017), 611–625.
- [8] Jakob Engel, Thomas Schöps, and Daniel Cremers. 2014. LSD-SLAM: Large-scale direct monocular SLAM. In European Conference on Computer Vision. Springer, 834–849.
- [9] Ben Glocker, Shahram Izadi, Jamie Shotton, and Antonio Criminisi. 2013. Real-time RGB-D camera relocalization. In 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 173–179.
- [10] Seongbo Ha, Jiung Yeon, and Hyeonwoo Yu. 2024. RGBD GS-ICP SLAM. In European Conference on Computer Vision. Springer, 180–197.
- [11] Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, and Zhaopeng Cui. 2024. CG-SLAM: Efficient dense RGB-D SLAM in a consistent uncertainty-aware 3D Gaussian field. In European Conference on Computer Vision. Springer, 93–112.
- [12] Jiarui Hu, Mao Mao, Hujun Bao, Guofeng Zhang, and Zhaopeng Cui. 2023. CP-SLAM: Collaborative neural point-based SLAM system. Advances in Neural Information Processing Systems 36 (2023), 39429–39442.
- [13] Mu Hu, Wei Yin, Chi Zhang, Zhipeng Cai, Xiaoxiao Long, Hao Chen, Kaixuan Wang, Gang Yu, Chunhua Shen, and Shaojie Shen. 2024. Metric3D v2: A versatile monocular geometric foundation model for zero-shot metric depth and surface normal estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 46, 12 (2024), 10579–10596.
- [14–15] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2024. 2D Gaussian splatting for geometrically accurate radiance fields. In ACM SIGGRAPH 2024 Conference Papers. 1–11.
- [16] Huajian Huang, Longwei Li, Hui Cheng, and Sai-Kit Yeung. 2024. Photo-SLAM: Real-time simultaneous localization and photorealistic mapping for monocular, stereo, and RGB-D cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21584–21593.
- [17] Mohammad Mahdi Johari, Camilla Carta, and François Fleuret. 2023. ESLAM: Efficient dense SLAM system based on hybrid representation of signed distance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 17408–17419.
- [18] Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula, Gengshan Yang, Sebastian Scherer, Deva Ramanan, and Jonathon Luiten. 2024. SplaTAM: Splat, track & map 3D Gaussians for dense RGB-D SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21357–21366.
- [19–20] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 2023. 3D Gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42, 4 (2023), Article 139.
- [21] Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. 2017. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics (ToG) 36, 4 (2017), 1–13.
- [22] Pierre-Yves Lajoie and Giovanni Beltrame. 2023. Swarm-SLAM: Sparse decentralized collaborative simultaneous localization and mapping framework for multi-robot systems. IEEE Robotics and Automation Letters 9, 1 (2023), 475–482.
- [23] Pierre-Yves Lajoie, Benjamin Ramtoula, Yun Chang, Luca Carlone, and Giovanni Beltrame. 2020. DOOR-SLAM: Distributed, online, and outlier resilient SLAM for robotic teams. IEEE Robotics and Automation Letters 5, 2 (2020), 1656–1663.
- [24] Kenneth Levenberg. 1944. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics 2, 2 (1944), 164–168.
- [25] Mingrui Li, Shuhong Liu, Heng Zhou, Guohao Zhu, Na Cheng, Tianchen Deng, and Hongyu Wang. 2024. SGS-SLAM: Semantic Gaussian splatting for neural dense SLAM. In European Conference on Computer Vision. Springer, 163–179.
- [26] Yonghao Li, Ping Ye, and Qingxuan Jia. 2025. MANG-SLAM: Multi-agent neural submap and Gaussian representation for dense mapping. IEEE Robotics and Automation Letters 11, 2 (2025), 2242–2249.
- [27] Lahav Lipson and Jia Deng. 2024. Multi-session SLAM with differentiable wide-baseline pose optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19626–19635.
- [28] Lorenzo Liso, Erik Sandström, Vladimir Yugay, Luc Van Gool, and Martin R Oswald. 2024. Loopy-SLAM: Dense neural SLAM with loop closures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 20363–20373.
- [29] Hidenobu Matsuki, Riku Murai, Paul HJ Kelly, and Andrew J Davison. 2024. Gaussian splatting SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18039–18048.
- [30] Riku Murai, Eric Dexheimer, and Andrew J Davison. 2025. MASt3R-SLAM: Real-time dense SLAM with 3D reconstruction priors. In Proceedings of the Computer Vision and Pattern Recognition Conference. 16695–16705.
- [31] Richard A Newcombe, Steven J Lovegrove, and Andrew J Davison. 2011. DTAM: Dense tracking and mapping in real-time. In 2011 International Conference on Computer Vision. IEEE, 2320–2327.
- [32] Jorge Nocedal and Stephen J Wright. 2006. Numerical Optimization. Springer.
- [33] Xiaqing Pan, Nicholas Charron, Yongqian Yang, Scott Peters, Thomas Whelan, Chen Kong, Omkar Parkhi, Richard Newcombe, and Yuheng Carl Ren. 2023. Aria digital twin: A new benchmark dataset for egocentric 3D machine perception. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 20133–20143.
- [34] Zhexi Peng, Tianjia Shao, Yong Liu, Jingke Zhou, Yin Yang, Jingdong Wang, and Kun Zhou. 2024. RTG-SLAM: Real-time 3D reconstruction at scale using Gaussian splatting. In ACM SIGGRAPH 2024 Conference Papers. 1–11.
- [35] Xavier Puig, Eric Undersander, Andrew Szot, Mikael Dallaire Cote, Tsung-Yen Yang, Ruslan Partsey, Ruta Desai, Alexander Clegg, Michal Hlavac, So Yeon Min, et al. 2024. Habitat 3.0: A co-habitat for humans, avatars, and robots. In International Conference on Learning Representations, Vol. 2024. 15306–15336.
- [36] Antoni Rosinol, John J Leonard, and Luca Carlone. 2023. NeRF-SLAM: Real-time dense monocular SLAM with neural radiance fields. In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 3437–3444.
- [37] Erik Sandström, Yue Li, Luc Van Gool, and Martin R Oswald. 2023. Point-SLAM: Dense neural point cloud-based SLAM. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18433–18444.
- [38] Erik Sandström, Ganlin Zhang, Keisuke Tateno, Michael Oechsle, Michael Niemeyer, Youmin Zhang, Manthan Patel, Luc Van Gool, Martin Oswald, and Federico Tombari. 2025. Splat-SLAM: Globally optimized RGB-only SLAM with 3D Gaussians. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1680–1691.
- [39] Patrik Schmuck and Margarita Chli. 2019. CCM-SLAM: Robust and efficient centralized collaborative monocular simultaneous localization and mapping for robotic teams. Journal of Field Robotics 36, 4 (2019), 763–781.
- [40]
- [41] Thomas Schops, Torsten Sattler, and Marc Pollefeys. 2019. BAD SLAM: Bundle adjusted direct RGB-D SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 134–144.
- [42] Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, et al. 2019. The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797 (2019).
- [43] Edgar Sucar, Shikun Liu, Joseph Ortiz, and Andrew J Davison. 2021. iMAP: Implicit mapping and positioning in real-time. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6229–6238.
- [44] Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, and Xiaowei Zhou. 2021. LoFTR: Detector-free local feature matching with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8922–8931.
- [45] Zachary Teed and Jia Deng. 2021. DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems 34 (2021), 16558–16569.
- [46] Zachary Teed, Lahav Lipson, and Jia Deng. 2023. Deep patch visual odometry. Advances in Neural Information Processing Systems 36 (2023), 39033–39051.
- [47] Annika Thomas, Aneesa Sonawalla, Alex Rose, and Jonathan P How. 2025. GRAND-SLAM: Local optimization for globally consistent large-scale multi-agent Gaussian SLAM. IEEE Robotics and Automation Letters (2025).
- [48] Yulun Tian, Yun Chang, Fernando Herrera Arias, Carlos Nieto-Granda, Jonathan P How, and Luca Carlone. 2022. Kimera-Multi: Robust, distributed, dense metric-semantic SLAM for multi-robot systems. IEEE Transactions on Robotics 38, 4 (2022).
- [49] Shinji Umeyama. 1991. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 13, 4 (1991), 376–380.
- [50] Hengyi Wang, Jingwen Wang, and Lourdes Agapito. 2023. Co-SLAM: Joint coordinate and sparse parametric encodings for neural real-time SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 13293–13302.
- [51] Xiaohao Xu, Feng Xue, Shibo Zhao, Yike Pan, Sebastian Scherer, and Xiaonan Huang. 2025. MAC-Ego3D: Multi-agent Gaussian consensus for real-time collaborative ego-motion and photorealistic 3D reconstruction. In Proceedings of the Computer Vision and Pattern Recognition Conference. 854–863.
- [52] Chi Yan, Delin Qu, Dan Xu, Bin Zhao, Zhigang Wang, Dong Wang, and Xuelong Li. 2024. GS-SLAM: Dense visual SLAM with 3D Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19595–19604.
- [53] Xingrui Yang, Hai Li, Hongjia Zhai, Yuhang Ming, Yuqian Liu, and Guofeng Zhang. 2022. Vox-Fusion: Dense tracking and mapping with voxel-based neural implicit representation. In 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 499–507.
- [54] Javier Yu, Timothy Chen, and Mac Schwager. 2025. HAMMER: Heterogeneous, multi-robot semantic Gaussian splatting. IEEE Robotics and Automation Letters (2025).
- [55] Vladimir Yugay, Theo Gevers, and Martin R Oswald. 2025. MAGiC-SLAM: Multi-agent Gaussian globally consistent SLAM. In Proceedings of the Computer Vision and Pattern Recognition Conference. 6741–6750.
- [56]
- [57] Wei Zhang, Qing Cheng, David Skuddis, Niclas Zeller, Daniel Cremers, and Norbert Haala. 2025. Hi-SLAM2: Geometry-aware Gaussian SLAM for fast monocular scene reconstruction. IEEE Transactions on Robotics 41 (2025), 6478–6493.
- [58] Wei Zhang, Tiecheng Sun, Sen Wang, Qing Cheng, and Norbert Haala. 2023. Hi-SLAM: Monocular real-time dense mapping with hybrid implicit fields. IEEE Robotics and Automation Letters 9, 2 (2023), 1548–1555.
- [59] Youmin Zhang, Fabio Tosi, Stefano Mattoccia, and Matteo Poggi. 2023. GO-SLAM: Global optimization for consistent 3D instant reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 3727–3737.
- [60]
- [61] Liyuan Zhu, Yue Li, Erik Sandström, Shengyu Huang, Konrad Schindler, and Iro Armeni. 2025. LoopSplat: Loop closure by registering 3D Gaussian splats. In 2025 International Conference on 3D Vision (3DV). IEEE, 156–167.
- [62] Zihan Zhu, Songyou Peng, Viktor Larsson, Zhaopeng Cui, Martin R Oswald, Andreas Geiger, and Marc Pollefeys. 2024. NICER-SLAM: Neural implicit scene encoding for RGB SLAM. In 2024 International Conference on 3D Vision (3DV). IEEE, 42–52.
- [63] Zihan Zhu, Songyou Peng, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R Oswald, and Marc Pollefeys. 2022. NICE-SLAM: Neural implicit scalable encoding for SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12786–12796.
- [64] Zihan Zhu, Wei Zhang, Moyang Li, Norbert Haala, Marc Pollefeys, and Daniel Barath. 2025. VIGS-SLAM: Visual-inertial Gaussian splatting SLAM. arXiv preprint arXiv:2512.02293 (2025).
Appendix A (The ReplicaMultiagent Plus Benchmark): Existing multi-agent SLAM benchmarks force a trade-off between photometric realism, agent count, and ground-truth completeness. Real-world ...