pith. machine review for the scientific record.

arxiv: 2605.10760 · v1 · submitted 2026-05-11 · 💻 cs.RO

Recognition: 2 theorem links

MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction

Anh Nguyen, Baoru Huang, Jing Zhang, Qi Shao, Shuhao Zhai, Zhihao Cao

Pith reviewed 2026-05-12 04:16 UTC · model grok-4.3

classification 💻 cs.RO
keywords SLAM · Gaussian Splatting · multi-agent · monocular · collaborative reconstruction · RGB-only mapping · loop verification

The pith

MAGS-SLAM lets multiple agents build a shared photorealistic 3D map from RGB images alone by exchanging compact submap summaries and fusing them consistently.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MAGS-SLAM as the first RGB-only multi-agent 3D Gaussian Splatting SLAM system for collaborative scene reconstruction. Each agent runs its own monocular tracking to create local Gaussian submaps and sends only compact summaries to a central server instead of raw images or full maps. Geometry- and appearance-aware loop verification detects overlaps between agents despite unknown scales, while occupancy-aware Gaussian fusion merges the submaps into one coherent global model. Experiments on synthetic and real datasets show tracking accuracy and rendering quality that match or beat existing RGB-D collaborative methods. This removes the need for depth sensors, opening the approach to lightweight or low-power robot teams.
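
The communication pattern is easiest to see as a server loop. Below is a minimal, hypothetical sketch of that flow; the names (SubmapSummary, MapServer, verify_loop) are illustrative assumptions, not the paper's actual interface, and the appearance-only loop gate is a placeholder for the full geometry-and-appearance check.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class SubmapSummary:
        # Hypothetical compact summary: poses plus sparse geometry and one
        # appearance descriptor, instead of raw images or dense Gaussians.
        agent_id: int
        poses: np.ndarray       # (K, 4, 4) keyframe camera-to-world poses
        points: np.ndarray      # (N, 3) downsampled 3D points
        descriptor: np.ndarray  # (D,) appearance vector for place recognition

    def verify_loop(a: SubmapSummary, b: SubmapSummary, tau: float = 0.8) -> bool:
        # Appearance gate only, as a placeholder; the paper also checks geometry.
        return float(a.descriptor @ b.descriptor) > tau

    class MapServer:
        def __init__(self):
            self.submaps: list[SubmapSummary] = []

        def receive(self, s: SubmapSummary) -> None:
            for other in self.submaps:
                if other.agent_id != s.agent_id and verify_loop(s, other):
                    # Align (scale-aware) and fuse here; see later sketches.
                    pass
            self.submaps.append(s)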

Core claim

MAGS-SLAM is the first RGB-only multi-agent 3DGS SLAM framework for collaborative scene reconstruction. Each agent independently builds local monocular Gaussian submaps and transmits compact submap summaries rather than raw observations or dense maps. The framework integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion to resolve monocular scale ambiguity and produce coherent global maps without active depth sensors. On the introduced ReplicaMultiagent Plus benchmark and other datasets it achieves competitive tracking accuracy together with comparable or superior rendering quality to state-of-the-art RGB-D collaborative Gaussian SLAM methods, while relying only on RGB images.

What carries the argument

Compact submap communication combined with geometry- and appearance-aware loop verification and occupancy-aware Gaussian fusion, which together align independent monocular maps into a single consistent reconstruction.
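
As a concrete reading of that mechanism, here is a hedged two-gate sketch: an appearance-similarity gate followed by a geometric inlier test on already-aligned sparse points. The thresholds and the brute-force nearest-neighbour test are illustrative assumptions, not the paper's actual criteria.

    import numpy as np

    def appearance_gate(desc_a, desc_b, tau_app=0.8):
        # Cosine similarity between the two submaps' compact descriptors.
        sim = desc_a @ desc_b / (np.linalg.norm(desc_a) * np.linalg.norm(desc_b))
        return sim > tau_app

    def geometry_gate(pts_a, pts_b_aligned, tau_geo=0.05, min_inliers=0.6):
        # Fraction of points in A with a neighbour in the aligned B within
        # tau_geo metres; a candidate loop must pass both gates.
        d = np.linalg.norm(pts_a[:, None, :] - pts_b_aligned[None, :, :], axis=2)
        return (d.min(axis=1) < tau_geo).mean() > min_inliers

    def verified(desc_a, desc_b, pts_a, pts_b_aligned):
        return appearance_gate(desc_a, desc_b) and geometry_gate(pts_a, pts_b_aligned)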

If this is right

  • Collaborative reconstruction becomes possible on robots equipped only with standard cameras.
  • Loop verification maintains geometric and appearance consistency across agents despite independent scale estimates.
  • Occupancy-aware fusion produces a single coherent global Gaussian map from separate local submaps (a minimal fusion sketch follows this list).
  • Real-time mapping and rendering performance is retained while eliminating depth-sensor hardware requirements.
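
For the fusion bullet above, a minimal sketch of what "occupancy-aware" merging could mean in practice: drop incoming Gaussians whose centres land in voxels the global map already occupies, so overlapping regions keep a single surface. The voxel test and its 5 cm resolution are assumptions for illustration, not the paper's algorithm.

    import numpy as np

    def fuse_occupancy_aware(global_means, incoming_means, voxel=0.05):
        # Voxelize the global map's Gaussian centres, then keep only those
        # incoming Gaussians that fall in so-far-unoccupied voxels.
        occupied = {tuple(v) for v in np.floor(global_means / voxel).astype(int)}
        keep = np.array([tuple(v) not in occupied
                         for v in np.floor(incoming_means / voxel).astype(int)])
        return np.vstack([global_means, incoming_means[keep]])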

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same fusion strategy could be tested on agents with different camera intrinsics to check robustness beyond the current benchmark.
  • Extending the compact summary format to include dynamic object labels would allow the method to handle moving elements in the scene.
  • Running the system with larger teams of agents would reveal whether communication bandwidth and fusion time scale linearly with team size.

Load-bearing premise

Geometry- and appearance-aware loop verification plus occupancy-aware Gaussian fusion can reliably resolve monocular scale ambiguity and produce coherent global maps from RGB images alone.
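
One standard way to probe this premise in isolation is to estimate a similarity transform (scale, rotation, translation) between matched 3D points from two agents using Umeyama's closed-form method, which the paper's reference list cites. This is a minimal sketch assuming clean correspondences; a robust wrapper such as RANSAC is left out.

    import numpy as np

    def umeyama_sim3(src, dst):
        # Closed-form fit of dst ≈ s * R @ src + t over matched (N, 3) points.
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        xs, xd = src - mu_s, dst - mu_d
        cov = xd.T @ xs / len(src)
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:
            S[2, 2] = -1.0  # guard against reflections
        R = U @ S @ Vt
        s = np.trace(np.diag(D) @ S) / xs.var(axis=0).sum()
        t = mu_d - s * R @ mu_s
        return s, R, t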

What would settle it

Run multiple agents on overlapping trajectories that induce large scale drift; check whether the fused map exhibits visible geometric distortions or photometric inconsistencies against ground-truth measurements.
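
A minimal evaluation sketch for that experiment, assuming matched trajectory positions and a similarity fit such as the one sketched above: report ATE RMSE after the best Sim(3) alignment; residual error that survives the fit is exactly the geometric distortion the test is looking for.

    import numpy as np

    def ate_rmse(est, gt, s, R, t):
        # est, gt: (N, 3) matched positions; (s, R, t) from umeyama_sim3.
        aligned = (s * (R @ est.T)).T + t
        return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))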

Figures

Figures reproduced from arXiv: 2605.10760 by Anh Nguyen, Baoru Huang, Jing Zhang, Qi Shao, Shuhao Zhai, Zhihao Cao.

Figure 1: Overview of MAGS-SLAM. MAGS-SLAM reconstructs a globally consistent photorealistic 3D Gaussian map from …
Figure 2: MAGS-SLAM Pipeline. Each agent runs JDSA-coupled dense BA on its RGB stream, back-projects inverse depth into …
Figure 3: Qualitative comparison on the ReplicaMultiagent dataset. We visualize reconstruction and rendering results of MAGS…
Figure 4: Qualitative comparison of multi-agent reconstruction on AiraMultiagent …
Figure 5: Qualitative ablation on Apart-2 from ReplicaMul…
Figure 6: Xbox-controller mapping used for trajectory record…
Figure 7: 4 examples from the ReplicaMultiagent Plus benchmark. Each row shows a scene with four agent trajectories (colour…
Figure 8: Qualitative comparison of multi-agent reconstruction on our proposed ReplicaMultiagent Plus dataset.
Figure 9: Qualitative results of MAGS-SLAM on the ReplicaMultiagent …
Figure 10: Qualitative results of MAGS-SLAM on the real-world indoor …
Figure 11: Qualitative results of MAGS-SLAM on the real-world indoor 7-Scenes dataset [9], and real-world outdoor Tanks and Temples [19] dataset. For the Tanks and Temples dataset, the two agent trajectories are formed by interleaved frame splitting …
read the original abstract

Collaborative photorealistic 3D reconstruction from multiple agents enables rapid large-scale scene capture for virtual production and cooperative multi-robot exploration. While recent 3D Gaussian Splatting (3DGS) SLAM algorithms can generate high-fidelity real-time mapping, most of the existing multi-agent Gaussian SLAM methods still rely on RGB-D sensors to obtain metric depth and simplify cross-agent alignment, which limits the deployment on lightweight, low-cost, or power-constrained robotic platforms. To address this challenge, we propose MAGS-SLAM, the first RGB-only multi-agent 3DGS SLAM framework for collaborative scene reconstruction. Each agent independently builds local monocular Gaussian submaps and transmits compact submap summaries rather than raw observations or dense maps. To facilitate robust collaboration in the presence of monocular scale ambiguity, our framework integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion, enabling coherent global reconstruction without active depth sensors. We further introduce the ReplicaMultiagent Plus benchmark for evaluating collaborative Gaussian SLAM. Intensive experiments on synthetic and real-world datasets show that MAGS-SLAM achieves competitive tracking accuracy and comparable or superior rendering quality to state-of-the-art RGB-D collaborative Gaussian SLAM methods while relying only on RGB images.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper introduces MAGS-SLAM, the first RGB-only multi-agent 3D Gaussian Splatting SLAM framework. Each agent constructs independent monocular Gaussian submaps and transmits compact summaries; geometry- and appearance-aware loop verification together with occupancy-aware fusion resolve scale ambiguity and produce coherent global maps. A new ReplicaMultiagent Plus benchmark is presented, and experiments on synthetic and real sequences report competitive tracking accuracy and comparable or superior rendering quality relative to state-of-the-art RGB-D collaborative Gaussian SLAM baselines.

Significance. If the empirical claims hold, the work is significant for enabling photorealistic collaborative 3DGS reconstruction on platforms that cannot carry depth sensors. The compact-summary communication and the explicit mechanisms for monocular scale handling are practical contributions that could broaden deployment in multi-robot exploration and virtual production. The new benchmark is a useful community resource.

minor comments (3)
  1. [Abstract] The phrases 'competitive tracking accuracy' and 'comparable or superior rendering quality' should be accompanied by the specific metrics (e.g., ATE, PSNR) and the exact baseline methods being compared (a minimal PSNR sketch follows this list).
  2. [§3.3] Loop verification: the geometry- and appearance-aware verification thresholds are listed among the free parameters; a brief sensitivity analysis or default values would strengthen reproducibility.
  3. [Experiments] Confirm that all reported numbers reflect multiple runs (with standard deviations where applicable), and that the RGB-D baselines receive no information beyond the depth input the RGB-only system forgoes.
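
For reference, the rendering metric named in comment 1 is computable in a few lines; this sketch assumes images normalized to [0, 1] and says nothing about which views the paper actually evaluates.

    import numpy as np

    def psnr(rendered, ground_truth, max_val=1.0):
        # Peak signal-to-noise ratio in dB between two equally sized images.
        mse = np.mean((rendered - ground_truth) ** 2)
        return float("inf") if mse == 0.0 else 10.0 * np.log10(max_val**2 / mse)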

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and recommendation of minor revision. The recognition of MAGS-SLAM's practical contributions in RGB-only multi-agent 3DGS SLAM, including compact submap sharing and monocular scale handling, is appreciated. We will incorporate any minor suggestions during revision.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The manuscript presents an engineering pipeline for multi-agent monocular 3DGS SLAM built from independent submap construction, compact summary exchange, geometry-appearance loop verification, and occupancy-aware fusion. These modules are described as novel algorithmic components rather than derived quantities obtained by fitting parameters to the target outputs or by self-referential definitions. No equations, uniqueness theorems, or ansatzes are shown to reduce to their own inputs by construction, and the evaluation relies on external benchmarks (ReplicaMultiagent Plus and real sequences) rather than internal self-consistency alone. The central claims therefore remain self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The framework rests on standard SLAM assumptions plus new algorithmic components; specific free parameters and exact axioms cannot be audited from the abstract alone.

free parameters (1)
  • loop verification thresholds and fusion occupancy parameters
    Typical hand-tuned or data-fitted values required for robust alignment and merging in multi-agent settings.
axioms (1)
  • domain assumption: Monocular scale ambiguity can be resolved via cross-agent geometry-appearance loop verification and occupancy-aware fusion
    This premise is required for the central claim that RGB-only operation suffices for coherent global reconstruction.
invented entities (1)
  • compact submap summaries (no independent evidence)
    purpose: Efficient communication of local Gaussian maps between agents without transmitting raw images or dense maps
    New data structure introduced to enable scalable multi-agent collaboration.
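
To make the "compact submap summaries" entry concrete, a back-of-envelope comparison against shipping the raw keyframes; every size below is an illustrative assumption, not a number from the paper.

    # Assumed sizes: 20 keyframes, 5,000 sparse points, a 256-D descriptor,
    # 640x480 RGB keyframes, float32 throughout.
    K, N, D = 20, 5_000, 256
    summary_bytes = K * 16 * 4 + N * 3 * 4 + D * 4  # poses + points + descriptor
    raw_bytes = K * 640 * 480 * 3                   # same keyframes as raw RGB
    print(summary_bytes, raw_bytes, round(raw_bytes / summary_bytes))
    # ~62 KB vs ~18 MB: roughly 300x less traffic per submap.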

pith-pipeline@v0.9.0 · 5542 in / 1293 out tokens · 48147 ms · 2026-05-12T04:16:38.629339+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 1 internal anchor

  [1] Paul J. Besl and Neil D. McKay. 1992. Method for registration of 3-D shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, Vol. 1611. SPIE, 586–606.
  [2] Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel, and Juan D. Tardós. 2021. ORB-SLAM3: An accurate open-source library for visual, visual-inertial, and multimap SLAM. IEEE Transactions on Robotics 37, 6 (2021), 1874–1890.
  [3] Zhihao Cao, Hanyu Wu, Li Wa Tang, Zizhou Luo, Wei Zhang, Marc Pollefeys, Zihan Zhu, and Martin R. Oswald. 2025. MCGS-SLAM: A multi-camera SLAM framework using Gaussian splatting for high-fidelity mapping. arXiv preprint arXiv:2509.14191 (2025).
  [4] Lin Chen, Yongxin Su, Jvboxi Wang, Pengcheng Han, Zhenyu Xia, Shuhui Bu, Kun Li, Boni Hu, Shengqi Meng, and Guangming Wang. 2026. CoMA-SLAM: Collaborative multi-agent Gaussian SLAM with geometric consistency. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 2922–2929.
  [5] Tianchen Deng, Guole Shen, Xun Chen, Shenghai Yuan, Hongming Shen, Guohao Peng, Zhenyu Wu, Jingchuan Wang, Lihua Xie, Danwei Wang, et al. 2025. MCN-SLAM: Multi-agent collaborative neural SLAM with hybrid implicit neural scene representation. arXiv preprint arXiv:2506.18678 (2025).
  [6] Tianchen Deng, Guole Shen, Chen Xun, Shenghai Yuan, Tongxin Jin, Hongming Shen, Yanbo Wang, Jingchuan Wang, Hesheng Wang, Danwei Wang, et al. 2025. MNE-SLAM: Multi-agent neural SLAM for mobile robots. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1485–1494.
  [7] Jakob Engel, Vladlen Koltun, and Daniel Cremers. 2017. Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 3 (2017), 611–625.
  [8] Jakob Engel, Thomas Schöps, and Daniel Cremers. 2014. LSD-SLAM: Large-scale direct monocular SLAM. In European Conference on Computer Vision. Springer, 834–849.
  [9] Ben Glocker, Shahram Izadi, Jamie Shotton, and Antonio Criminisi. 2013. Real-time RGB-D camera relocalization. In 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 173–179.
  [10] Seongbo Ha, Jiung Yeon, and Hyeonwoo Yu. 2024. RGBD GS-ICP SLAM. In European Conference on Computer Vision. Springer, 180–197.
  [11] Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, and Zhaopeng Cui. 2024. CG-SLAM: Efficient dense RGB-D SLAM in a consistent uncertainty-aware 3D Gaussian field. In European Conference on Computer Vision. Springer, 93–112.
  [12] Jiarui Hu, Mao Mao, Hujun Bao, Guofeng Zhang, and Zhaopeng Cui. 2023. CP-SLAM: Collaborative neural point-based SLAM system. Advances in Neural Information Processing Systems 36 (2023), 39429–39442.
  [13] Mu Hu, Wei Yin, Chi Zhang, Zhipeng Cai, Xiaoxiao Long, Hao Chen, Kaixuan Wang, Gang Yu, Chunhua Shen, and Shaojie Shen. 2024. Metric3D v2: A versatile monocular geometric foundation model for zero-shot metric depth and surface normal estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 46, 12 (2024), 10579–10596.
  [14] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2024. 2D Gaussian splatting for geometrically accurate radiance fields. In ACM SIGGRAPH 2024 Conference Papers. 1–11.
  [15] Huajian Huang, Longwei Li, Hui Cheng, and Sai-Kit Yeung. 2024. Photo-SLAM: Real-time simultaneous localization and photorealistic mapping for monocular, stereo, and RGB-D cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21584–21593.
  [16] Mohammad Mahdi Johari, Camilla Carta, and François Fleuret. 2023. ESLAM: Efficient dense SLAM system based on hybrid representation of signed distance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 17408–17419.
  [17] Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula, Gengshan Yang, Sebastian Scherer, Deva Ramanan, and Jonathon Luiten. 2024. SplaTAM: Splat, track & map 3D Gaussians for dense RGB-D SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21357–21366.
  [18] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 2023. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics 42, 4 (2023), Article 139.
  [19] Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. 2017. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics (ToG) 36, 4 (2017), 1–13.
  [20] Pierre-Yves Lajoie and Giovanni Beltrame. 2023. Swarm-SLAM: Sparse decentralized collaborative simultaneous localization and mapping framework for multi-robot systems. IEEE Robotics and Automation Letters 9, 1 (2023), 475–482.
  [21] Pierre-Yves Lajoie, Benjamin Ramtoula, Yun Chang, Luca Carlone, and Giovanni Beltrame. 2020. DOOR-SLAM: Distributed, online, and outlier resilient SLAM for robotic teams. IEEE Robotics and Automation Letters 5, 2 (2020), 1656–1663.
  [22] Kenneth Levenberg. 1944. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics 2, 2 (1944), 164–168.
  [23] Mingrui Li, Shuhong Liu, Heng Zhou, Guohao Zhu, Na Cheng, Tianchen Deng, and Hongyu Wang. 2024. SGS-SLAM: Semantic Gaussian splatting for neural dense SLAM. In European Conference on Computer Vision. Springer, 163–179.
  [24] Yonghao Li, Ping Ye, and Qingxuan Jia. 2025. MANG-SLAM: Multi-agent neural submap and Gaussian representation for dense mapping. IEEE Robotics and Automation Letters 11, 2 (2025), 2242–2249.
  [25] Lahav Lipson and Jia Deng. 2024. Multi-session SLAM with differentiable wide-baseline pose optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19626–19635.
  [26] Lorenzo Liso, Erik Sandström, Vladimir Yugay, Luc Van Gool, and Martin R. Oswald. 2024. Loopy-SLAM: Dense neural SLAM with loop closures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 20363–20373.
  [27] Hidenobu Matsuki, Riku Murai, Paul H. J. Kelly, and Andrew J. Davison. 2024. Gaussian splatting SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18039–18048.
  [28] Riku Murai, Eric Dexheimer, and Andrew J. Davison. 2025. MASt3R-SLAM: Real-time dense SLAM with 3D reconstruction priors. In Proceedings of the Computer Vision and Pattern Recognition Conference. 16695–16705.
  [29] Richard A. Newcombe, Steven J. Lovegrove, and Andrew J. Davison. 2011. DTAM: Dense tracking and mapping in real-time. In 2011 International Conference on Computer Vision. IEEE, 2320–2327.
  [30] Jorge Nocedal and Stephen J. Wright. 2006. Numerical Optimization. Springer.
  [31] Xiaqing Pan, Nicholas Charron, Yongqian Yang, Scott Peters, Thomas Whelan, Chen Kong, Omkar Parkhi, Richard Newcombe, and Yuheng Carl Ren. 2023. Aria digital twin: A new benchmark dataset for egocentric 3D machine perception. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 20133–20143.
  [32] Zhexi Peng, Tianjia Shao, Yong Liu, Jingke Zhou, Yin Yang, Jingdong Wang, and Kun Zhou. 2024. RTG-SLAM: Real-time 3D reconstruction at scale using Gaussian splatting. In ACM SIGGRAPH 2024 Conference Papers. 1–11.
  [33] Xavier Puig, Eric Undersander, Andrew Szot, Mikael Dallaire Cote, Tsung-Yen Yang, Ruslan Partsey, Ruta Desai, Alexander Clegg, Michal Hlavac, So Yeon Min, et al. 2024. Habitat 3.0: A co-habitat for humans, avatars, and robots. In International Conference on Learning Representations, Vol. 2024. 15306–15336.
  [34] Antoni Rosinol, John J. Leonard, and Luca Carlone. 2023. NeRF-SLAM: Real-time dense monocular SLAM with neural radiance fields. In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 3437–3444.
  [35] Erik Sandström, Yue Li, Luc Van Gool, and Martin R. Oswald. 2023. Point-SLAM: Dense neural point cloud-based SLAM. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18433–18444.
  [36] Erik Sandström, Ganlin Zhang, Keisuke Tateno, Michael Oechsle, Michael Niemeyer, Youmin Zhang, Manthan Patel, Luc Van Gool, Martin Oswald, and Federico Tombari. 2025. Splat-SLAM: Globally optimized RGB-only SLAM with 3D Gaussians. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1680–1691.
  [37] Patrik Schmuck and Margarita Chli. 2019. CCM-SLAM: Robust and efficient centralized collaborative monocular simultaneous localization and mapping for robotic teams. Journal of Field Robotics 36, 4 (2019), 763–781.
  [38] Patrik Schmuck, Thomas Ziegler, Marco Karrer, Jonathan Perraudin, and Margarita Chli. 2021. COVINS: Visual-inertial SLAM for centralized collaboration. arXiv preprint arXiv:2108.05756 (2021).
  [39] Thomas Schops, Torsten Sattler, and Marc Pollefeys. 2019. BAD SLAM: Bundle adjusted direct RGB-D SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 134–144.
  [40] Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J. Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, et al. 2019. The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797 (2019).
  [41] Edgar Sucar, Shikun Liu, Joseph Ortiz, and Andrew J. Davison. 2021. iMAP: Implicit mapping and positioning in real-time. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6229–6238.
  [42] Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, and Xiaowei Zhou. 2021. LoFTR: Detector-free local feature matching with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8922–8931.
  [43] Zachary Teed and Jia Deng. 2021. DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems 34 (2021), 16558–16569.
  [44] Zachary Teed, Lahav Lipson, and Jia Deng. 2023. Deep patch visual odometry. Advances in Neural Information Processing Systems 36 (2023), 39033–39051.
  [45] Annika Thomas, Aneesa Sonawalla, Alex Rose, and Jonathan P. How. 2025. GRAND-SLAM: Local optimization for globally consistent large-scale multi-agent Gaussian SLAM. IEEE Robotics and Automation Letters (2025).
  [46] Yulun Tian, Yun Chang, Fernando Herrera Arias, Carlos Nieto-Granda, Jonathan P. How, and Luca Carlone. 2022. Kimera-Multi: Robust, distributed, dense metric-semantic SLAM for multi-robot systems. IEEE Transactions on Robotics 38, 4 (2022).
  [47] Shinji Umeyama. 1991. Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 13, 4 (1991), 376–380.
  [48] Hengyi Wang, Jingwen Wang, and Lourdes Agapito. 2023. Co-SLAM: Joint coordinate and sparse parametric encodings for neural real-time SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 13293–13302.
  [49] Xiaohao Xu, Feng Xue, Shibo Zhao, Yike Pan, Sebastian Scherer, and Xiaonan Huang. 2025. MAC-Ego3D: Multi-agent Gaussian consensus for real-time collaborative ego-motion and photorealistic 3D reconstruction. In Proceedings of the Computer Vision and Pattern Recognition Conference. 854–863.
  [50] Chi Yan, Delin Qu, Dan Xu, Bin Zhao, Zhigang Wang, Dong Wang, and Xuelong Li. 2024. GS-SLAM: Dense visual SLAM with 3D Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 19595–19604.
  [51] Xingrui Yang, Hai Li, Hongjia Zhai, Yuhang Ming, Yuqian Liu, and Guofeng Zhang. 2022. Vox-Fusion: Dense tracking and mapping with voxel-based neural implicit representation. In 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 499–507.
  [52] Javier Yu, Timothy Chen, and Mac Schwager. 2025. HAMMER: Heterogeneous, multi-robot semantic Gaussian splatting. IEEE Robotics and Automation Letters (2025).
  [53] Vladimir Yugay, Theo Gevers, and Martin R. Oswald. 2025. MAGiC-SLAM: Multi-agent Gaussian globally consistent SLAM. In Proceedings of the Computer Vision and Pattern Recognition Conference. 6741–6750.
  [54] Vladimir Yugay, Yue Li, Theo Gevers, and Martin R. Oswald. 2023. Gaussian-SLAM: Photo-realistic dense SLAM with Gaussian splatting. arXiv preprint arXiv:2312.10070 (2023).
  [55] Wei Zhang, Qing Cheng, David Skuddis, Niclas Zeller, Daniel Cremers, and Norbert Haala. 2025. HI-SLAM2: Geometry-aware Gaussian SLAM for fast monocular scene reconstruction. IEEE Transactions on Robotics 41 (2025), 6478–6493.
  [56] Wei Zhang, Tiecheng Sun, Sen Wang, Qing Cheng, and Norbert Haala. 2023. HI-SLAM: Monocular real-time dense mapping with hybrid implicit fields. IEEE Robotics and Automation Letters 9, 2 (2023), 1548–1555.
  [57] Youmin Zhang, Fabio Tosi, Stefano Mattoccia, and Matteo Poggi. 2023. GO-SLAM: Global optimization for consistent 3D instant reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 3727–3737.
  [58] Yuchen Zhou and Haihang Wu. 2025. Multi-agent monocular dense SLAM with 3D reconstruction priors. arXiv preprint arXiv:2511.19031 (2025).
  [59] Liyuan Zhu, Yue Li, Erik Sandström, Shengyu Huang, Konrad Schindler, and Iro Armeni. 2025. LoopSplat: Loop closure by registering 3D Gaussian splats. In 2025 International Conference on 3D Vision (3DV). IEEE, 156–167.
  [60] Zihan Zhu, Songyou Peng, Viktor Larsson, Zhaopeng Cui, Martin R. Oswald, Andreas Geiger, and Marc Pollefeys. 2024. NICER-SLAM: Neural implicit scene encoding for RGB SLAM. In 2024 International Conference on 3D Vision (3DV). IEEE, 42–52.
  [61] Zihan Zhu, Songyou Peng, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R. Oswald, and Marc Pollefeys. 2022. NICE-SLAM: Neural implicit scalable encoding for SLAM. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12786–12796.
  [62] Zihan Zhu, Wei Zhang, Moyang Li, Norbert Haala, Marc Pollefeys, and Daniel Barath. 2025. VIGS-SLAM: Visual inertial Gaussian splatting SLAM. arXiv preprint arXiv:2512.02293 (2025).

From Appendix A, The ReplicaMultiagent Plus Benchmark (fragment): "Existing multi-agent SLAM benchmarks force a trade-off between photometric realism, agent count, and ground-truth completeness. Real-world …"