pith. machine review for the scientific record.

arxiv: 2603.22650 · v2 · submitted 2026-03-23 · 💻 cs.CV · cs.RO

Recognition: unknown

MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 00:05 UTC · model grok-4.3

classification 💻 cs.CV cs.RO
keywords active mapping · long-term planning · 3D Gaussian Splatting · occupancy network · next-best-view · volumetric rendering · tree search · surface coverage

The pith

Imagined Gaussians from a pre-trained occupancy network let agents plan long-horizon trajectories that maximize total surface coverage in active mapping.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Active mapping requires an agent to choose movements that reconstruct unknown spaces with minimal effort. Most prior methods pick the single next best view greedily, which often produces inefficient paths and incomplete maps. MAGICIAN instead builds a temporary scene model called Imagined Gaussians by combining a pre-trained occupancy network with 3D Gaussian Splatting. This model supports fast volumetric rendering to estimate coverage gain from any future viewpoint, which is then used inside a tree-search planner to select sequences of actions that accumulate more coverage over many steps. The representation is refreshed in a closed loop as new observations arrive, and the approach reports state-of-the-art coverage on both indoor and outdoor test environments.
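The pipeline in this paragraph can be sketched as a toy closed loop on a one-dimensional grid. The coverage score, beam search, and replanning cadence below are illustrative stand-ins assumed for exposition, not the paper's implementation:

```python
# Toy closed-loop active mapping on a 1-D corridor of 10 cells.
# The names mirror the paper's vocabulary (coverage gain, beam search,
# replanning), but every implementation detail is an illustrative stand-in.

def coverage_gain(covered, pose):
    """Gain of visiting `pose`: 1 if its cell is still unmapped, else 0."""
    return 0 if pose in covered else 1

def beam_search(covered, pose, depth=3, beam=4, world=range(10)):
    """Pick the +/-1 action sequence with the highest accumulated gain."""
    beams = [([], pose, set(covered), 0)]  # (actions, pose, covered, gain)
    for _ in range(depth):
        expanded = []
        for actions, p, cov, g in beams:
            for a in (-1, 1):
                q = p + a
                if q in world:
                    expanded.append((actions + [a], q, cov | {q},
                                     g + coverage_gain(cov, q)))
        beams = sorted(expanded, key=lambda b: -b[3])[:beam]
    return beams[0][0]

def closed_loop(start=0, budget=9, n_execute=2):
    """Replan every n_execute steps, mimicking the closed-loop cadence."""
    pose, covered, steps = start, {start}, 0
    while steps < budget:
        plan = beam_search(covered, pose)
        for a in plan[:n_execute]:       # execute only a prefix, then replan
            pose += a
            covered.add(pose)
            steps += 1
            if steps >= budget:
                break
    return covered
```

Running `closed_loop()` maps all ten cells within the nine-step budget; the same machinery with `depth=1` would reduce to one-step greedy selection.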

Core claim

The paper claims that a scene representation formed by Imagined Gaussians, obtained by lifting a pre-trained occupancy network into 3D Gaussian Splatting, permits rapid volumetric rendering of coverage gain for arbitrary novel viewpoints; this quantity can be maximized over long horizons by a tree-search algorithm whose trajectories are refined in closed loop with the evolving Gaussian model, yielding higher accumulated surface coverage than greedy next-best-view selection.
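Using the symbols that appear in the figure captions, where γ(x|C_t) ∈ {0, 1} marks a surface point x not yet covered by the past cameras C_t and o(x, c) indicates visibility from pose c, the claim's long-horizon objective can be written as a hedged reconstruction (the paper's exact surface measure and normalization may differ):

```latex
% Hedged reconstruction from the caption notation; \mathcal{S} is the scene surface.
G(\tau \mid C_t) = \sum_{k=1}^{N_d} \int_{\mathcal{S}}
    \gamma\bigl(x \mid C_{t+k-1}\bigr)\, o\bigl(x, c_{t+k}\bigr)\, \mathrm{d}x,
\qquad
\tau^\star = \arg\max_{\tau = (c_{t+1}, \ldots, c_{t+N_d})} G(\tau \mid C_t).
```

Greedy next-best-view is the special case N_d = 1; the tree search instead maximizes the accumulated sum, and the closed loop re-estimates γ after each executed prefix.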

What carries the argument

Imagined Gaussians, a temporary 3D Gaussian Splatting representation derived from a pre-trained occupancy network that encodes structural priors and supports fast volumetric rendering of surface coverage gain for any candidate viewpoint.
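A minimal sketch of the "lifting" step, assuming the occupancy network outputs a voxel grid of probabilities. The rule of one Gaussian per sufficiently occupied voxel with opacity equal to predicted occupancy is our illustrative guess, not the paper's parameterization:

```python
# Minimal sketch of "lifting" an occupancy prediction into Gaussian
# primitives: one Gaussian per sufficiently occupied voxel, opacity set to
# the predicted occupancy. This rule is an illustrative assumption; the
# paper's actual parameterization is not reproduced here.
import numpy as np

def lift_occupancy_to_gaussians(occ, threshold=0.5):
    """occ: (X, Y, Z) array of occupancy probabilities in [0, 1].
    Returns Gaussian centers (voxel centers), opacities, and isotropic
    scales of half a voxel."""
    idx = np.argwhere(occ > threshold)
    centers = idx.astype(float) + 0.5       # voxel-center positions
    opacities = occ[tuple(idx.T)]           # opacity = predicted occupancy
    scales = np.full(len(idx), 0.5)         # half-voxel isotropic extent
    return centers, opacities, scales
```

Because the Gaussians come from predicted rather than observed occupancy, they can be splatted to score viewpoints in regions the agent has never seen, which is the point of calling them "imagined".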

If this is right

  • Long-horizon tree search replaces single-step greedy selection, directly increasing total surface coverage accumulated along the trajectory.
  • Fast volumetric rendering inside the Gaussian model allows many candidate future views to be evaluated without expensive ray casting or mesh extraction.
  • Closed-loop updates keep the Imagined Gaussians aligned with incoming sensor data, enabling the planner to react to newly discovered structure.
  • The same representation works across indoor and outdoor scenes and across different discrete or continuous action spaces.
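The second bullet above, scoring many candidate views by rendering rather than ray casting, can be illustrated with point proxies and a z-buffer. The pinhole model and binary novelty flags are assumptions standing in for the paper's Gaussian rasterizer and novelty maps:

```python
# Toy "render, don't ray-cast" scoring of a candidate view: splat point
# proxies for the Imagined Gaussians into a small novelty map with a
# z-buffer. The pinhole model and binary novelty flags are assumptions,
# not the paper's Gaussian rasterizer.
import numpy as np

def novelty_map(points, novelty, cam_pos, res=8, fov=1.0):
    """Project points to a res x res image, keeping the nearest point per
    pixel; record whether that nearest point is still novel."""
    depth = np.full((res, res), np.inf)
    novel = np.zeros((res, res), dtype=bool)
    for p, is_novel in zip(points, novelty):
        d = p - cam_pos
        if d[2] <= 0:                       # behind the camera
            continue
        u = int((d[0] / (d[2] * fov) + 1) * res / 2)
        v = int((d[1] / (d[2] * fov) + 1) * res / 2)
        if 0 <= u < res and 0 <= v < res and d[2] < depth[v, u]:
            depth[v, u] = d[2]              # z-buffer handles occlusion
            novel[v, u] = is_novel
    return novel

def coverage_gain(points, novelty, cam_pos):
    """Gain of a view = pixels whose nearest surface point is novel."""
    return int(novelty_map(points, novelty, cam_pos).sum())
```

A near, already-covered point occludes a far novel point on the same ray, so the occluded point contributes no gain; evaluating many candidate poses then costs only that many cheap rasterizations.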

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The approach could be combined with learned motion priors to prune the search tree further and reduce planning time in large environments.
  • If the occupancy network generalizes across robot platforms, the same Gaussian model might support active mapping on aerial, ground, or underwater agents without retraining.
  • Coverage-gain estimates derived from Gaussians could serve as a drop-in reward signal for reinforcement-learning policies trained in simulation.

Load-bearing premise

The pre-trained occupancy network supplies structural priors strong enough that the derived Imagined Gaussians accurately forecast coverage gains for novel viewpoints in environments never seen during training.

What would settle it

Deploy both MAGICIAN and a standard greedy next-best-view planner on the same unseen real-world scene for a fixed budget of actions and measure final surface coverage; if the long-horizon planner does not produce strictly higher coverage, the claimed advantage of the imagined-Gaussian planning loop is falsified.
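A toy version of this experiment runs both planners on the same corridor with the same action budget and compares only final coverage. The world, tie-breaking, and planner internals are all illustrative, so only the protocol, not the outcome, carries over to real scenes:

```python
# Toy falsification protocol: identical world, identical action budget,
# compare only final coverage. The corridor and planner internals are
# illustrative stand-ins; the protocol is what matters.

WORLD = range(10)

def plan(pose, covered, depth, beam):
    """Beam search over +/-1 moves, scored by newly seen cells."""
    beams = [([], pose, frozenset(covered), 0)]
    for _ in range(depth):
        expanded = []
        for actions, p, cov, gain in beams:
            for a in (-1, 1):
                q = p + a
                if q in WORLD:
                    expanded.append((actions + [a], q, cov | {q},
                                     gain + (q not in cov)))
        beams = sorted(expanded, key=lambda b: -b[3])[:beam]
    return beams[0][0]

def run(depth, beam, budget=10):
    """Left half of the corridor is pre-mapped; the agent starts at cell 0."""
    pose, covered = 0, set(range(6))
    for _ in range(budget):
        pose += plan(pose, covered, depth, beam)[0]  # replan every step
        covered.add(pose)
    return len(covered)

greedy_coverage = run(depth=1, beam=1)    # next-best-view-style baseline
planner_coverage = run(depth=6, beam=64)  # long-horizon search
```

In this toy, the greedy baseline oscillates at the mapped boundary (6 cells) while the long-horizon planner crosses the already-covered region to reach the unmapped right half (all 10 cells); whether the same gap appears on real unseen scenes is exactly what the proposed experiment would settle.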

Figures

Figures reproduced from arXiv: 2603.22650 by Antoine Guédon, Shiyao Li, Shizhe Chen, Vincent Lepetit.

Figure 1
Figure 1: MAGICIAN enables efficient, high-coverage exploration across diverse environments. We visualize the exploration trajectories (light-to-dark gradients) generated by our method and the resulting 3D reconstructions (surface meshes and textures) for various outdoor and indoor scenes. MAGICIAN is powered by what we call "Imagined Gaussians", predicted by our occupancy model to model scene uncertainty, making e… view at source ↗
Figure 2
Figure 2: Overview of the proposed MAGICIAN framework. At time t, we first predict the occupancy field using the occupancy model and update the Imagined Gaussians. We can then efficiently estimate the coverage gain and apply beam search to plan Nb candidate trajectories, selecting the one with the highest expected gain. The agent then executes the first Nf actions of the best trajectory τk of length Nd before repeat… view at source ↗
Figure 3
Figure 3: Computing coverage gain with Imagined Gaussians. During beam search, we evaluate candidate poses by rendering novelty maps from the Imagined Gaussians to compute the coverage gain. The corresponding depth maps are then used to update the novelty γˆ of Gaussians within a depth tolerance ϵd. o(x, c) equals 0 if point x is occluded from camera ct and 1 otherwise; and γ(x|Ct) ∈ {0, 1}, referred to as the nove… view at source ↗
Figure 4
Figure 4: Evolution of Imagined Gaussians Compared with Ground Truth Mesh. The brighter the Gaussians, the higher their predicted occupancy. As exploration progresses (from left to right), our Imagined Gaussians increasingly align with the ground truth mesh, demonstrating improved environmental modeling. view at source ↗
Figure 5
Figure 5: 3D reconstructions obtained with our trajectories. We show Gaussian splatting renderings (top row) and normal maps of the reconstructed meshes (bottom row) after applying Mesh-In-the-Loop Gaussian Splatting [20] on 100 RGB images collected along our trajectories. The trajectories output by our method cover the entire scene surfaces, resulting in complete and accurate surface meshes. view at source ↗
Figure 6
Figure 6: Qualitative comparison of novel view synthesis (top row) and surface reconstruction (bottom row) in outdoor and indoor scenes. For each method, we show RGB Gaussian splatting renderings and normal maps of reconstructed meshes after applying Mesh-In-the-Loop Gaussian Splatting [20] on 100 images collected along the trajectory. The trajectories computed with our method produce more accurate and complete reco… view at source ↗
Figure 8
Figure 8: Ablation study on replanning frequency. The horizontal axis indicates the number Nf of movement steps executed after each planning phase before replanning. Replanning every 6 steps already provides state-of-the-art results. view at source ↗
Figure 7
Figure 7: Ablation study on the beam search parameters. The horizontal axis denotes the beam width Nb, and the vertical axis represents the look-ahead steps Nd. Five steps correspond roughly to half the size of the scene. Results improve as the number of beams Nb and the number of look-ahead steps Nd increase, demonstrating the effectiveness of the proposed beam search strategy. Increasing the number of beams or look-ahead… view at source ↗
Figure 9
Figure 9: Our method exhibits consistently low values in this metric, indicating that its performance is highly robust: despite different random initial poses, it reliably achieves high final coverage. In contrast, the other methods exhibit substantially larger variance, suggesting that their performance is highly sensitive to the initial pose. (Bar chart comparing MACARONS, FisherRF, and Ours.) view at source ↗
Figure 10
Figure 10: Visualization of exploration trajectories and qualitative comparisons of novel view synthesis and surface reconstruction in outdoor scenes. From top to bottom, the scenes are Pisa Cathedral and Neuschwanstein Castle. In the same scene, all methods start from the same initial camera pose, and for each trajectory visualization, we additionally show the final camera pose at the end of the trajectory. Our tra… view at source ↗
Figure 11
Figure 11: Visualization of exploration trajectories and qualitative comparisons of novel view synthesis and surface reconstruction in indoor scenes. From top to bottom, the scenes are St. Sofia Church and Barts. In the same scene, all methods start from the same initial camera pose, and for each trajectory visualization, we additionally show the final camera pose at the end of the trajectory. Our trajectory plannin… view at source ↗
read the original abstract

Active mapping aims to determine how an agent should move to efficiently reconstruct unknown environments. Most existing approaches rely on greedy next-best-view prediction, resulting in inefficient exploration and incomplete reconstruction. To address this, we introduce MAGICIAN, a novel long-term planning framework that maximizes accumulated surface coverage gain through Imagined Gaussians, a scene representation based on 3D Gaussian Splatting, derived from a pre-trained occupancy network with strong structural priors. This representation enables efficient coverage gain computation for any novel viewpoint via fast volumetric rendering, allowing its integration into a tree-search algorithm for long-horizon planning. We update Imagined Gaussians and refine the trajectory in a closed loop. Our method achieves state-of-the-art performance across indoor and outdoor benchmarks with varying action spaces, highlighting the advantage of long-term planning in active mapping.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces MAGICIAN, a long-term planning framework for active mapping that derives Imagined Gaussians from a pre-trained occupancy network to enable fast volumetric rendering of coverage gains for novel viewpoints, which are then maximized via tree-search planning in a closed-loop update scheme, claiming state-of-the-art performance on indoor and outdoor benchmarks with varying action spaces.

Significance. If the Imagined Gaussians yield accurate coverage predictions that transfer to unseen scenes, the approach would meaningfully advance active mapping by replacing greedy next-best-view selection with efficient long-horizon planning, potentially improving reconstruction completeness in robotics applications.

major comments (2)
  1. [§3.2] (Imagined Gaussians derivation): The coverage-gain computation via fast rendering assumes that the pre-trained occupancy network supplies sufficiently accurate structural priors for novel viewpoints and environments, yet no quantitative prediction-error metrics or cross-scene transfer experiments are reported; this assumption is load-bearing for the tree-search planner's claimed advantage.
  2. [Experiments] Table 1 (SOTA comparison): The reported performance gains over greedy baselines are presented without ablations isolating the contribution of long-horizon planning versus the representation itself, nor any validation that the planner optimizes against faithful rather than erroneous coverage estimates.
minor comments (2)
  1. [Abstract] The abstract and §2 could more explicitly define the closed-loop update procedure and the precise formulation of accumulated coverage gain.
  2. [§3] Notation for the Gaussian parameters and rendering integral is introduced without a consolidated table, which would aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. The comments highlight important aspects of validation that will strengthen the manuscript. We address each point below and will incorporate the suggested additions in the revised version.

read point-by-point responses
  1. Referee: [§3.2] (Imagined Gaussians derivation): The coverage-gain computation via fast rendering assumes that the pre-trained occupancy network supplies sufficiently accurate structural priors for novel viewpoints and environments, yet no quantitative prediction-error metrics or cross-scene transfer experiments are reported; this assumption is load-bearing for the tree-search planner's claimed advantage.

    Authors: We agree that direct validation of the occupancy network's structural priors is essential to support the coverage-gain computation and the planner's advantage. While the SOTA results on diverse indoor and outdoor benchmarks provide indirect evidence of effective generalization, the original manuscript did not report quantitative prediction-error metrics or dedicated cross-scene transfer experiments. In the revised version, we will add these: occupancy prediction errors (e.g., IoU and MSE on novel viewpoints) and cross-scene transfer results to explicitly demonstrate the accuracy and transferability of the priors. This will directly address the load-bearing assumption. revision: yes

  2. Referee: [Experiments] Table 1 (SOTA comparison): The reported performance gains over greedy baselines are presented without ablations isolating the contribution of long-horizon planning versus the representation itself, nor any validation that the planner optimizes against faithful rather than erroneous coverage estimates.

    Authors: We appreciate the call for clearer isolation of contributions. The original experiments emphasized end-to-end SOTA comparisons but did not include targeted ablations separating the long-horizon planner from the Imagined Gaussians representation, nor direct validation against ground-truth coverage. In the revision, we will add ablations that (1) compare tree-search long-horizon planning against greedy selection with the same representation and (2) evaluate planner performance using our coverage estimates versus oracle ground-truth estimates. These will confirm the source of the gains and the faithfulness of the estimates. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on external pre-trained priors and standard rendering/planning without self-referential reduction

full rationale

The abstract and description present Imagined Gaussians as derived from an external pre-trained occupancy network, with coverage gains computed via volumetric rendering and integrated into tree-search planning. No equations, predictions, or claims reduce by construction to fitted inputs, self-citations, or ansatzes from the same authors. The SOTA performance is benchmark-driven rather than tautological, and the method chain remains self-contained against external priors without load-bearing internal loops.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that a pre-trained occupancy network supplies reliable structural priors for generating Imagined Gaussians that generalize to novel scenes; no free parameters or invented entities are explicitly quantified in the abstract.

axioms (1)
  • domain assumption Pre-trained occupancy networks encode strong structural priors that transfer to unseen environments for coverage prediction.
    Invoked to justify deriving Imagined Gaussians from the network for accurate novel-view coverage computation.
invented entities (1)
  • Imagined Gaussians (no independent evidence)
    purpose: Scene representation enabling fast volumetric rendering of coverage gain for any viewpoint in long-term planning.
    New representation introduced in the paper, derived from occupancy network; no independent evidence provided in abstract.

pith-pipeline@v0.9.0 · 5444 in / 1363 out tokens · 32215 ms · 2026-05-15T00:05:04.053434+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 1 internal anchor

  1. [1] Shi Bai, Jinkun Wang, Fanfei Chen, and Brendan Englot. Information-Theoretic Exploration with Bayesian Optimization. In International Conference on Intelligent Robots and Systems, pages 1816–1822, 2016.
  2. [2] Joseph E. Banta, L. R. Wong, Christophe Dumont, and Mongi A. Abidi. A Next-Best-View System for Autonomous 3D Object Reconstruction. IEEE Transactions on Systems, Man, and Cybernetics, 30(5):589–598, 2000.
  3. [3] Ana Batinovic, Tamara Petrovic, Antun Ivanovic, Frano Petric, and Stjepan Bogdan. A Multi-Resolution Frontier-Based Planner for Autonomous 3D Exploration. IEEE Robotics and Automation Letters, 6(3):4528–4535, 2021.
  4. [4] Andreas Bircher, Mina Kamel, Kostas Alexis, Helen Oleynikova, and Roland Siegwart. Receding Horizon "Next-Best-View" Planner for 3D Exploration. In International Conference on Robotics and Automation, pages 1462–1468, 2016.
  5. [5] Frederic Bourgault, Alexei A. Makarenko, Stefan B. Williams, Ben Grocholsky, and Hugh F. Durrant-Whyte. Information Based Adaptive Robotic Exploration. In International Conference on Intelligent Robots and Systems, pages 540–545, 2002.
  6. [6] Chao Cao, Ji Zhang, Matt Travers, and Howie Choset. Hierarchical Coverage Path Planning in Complex 3D Environments. In International Conference on Robotics and Automation, pages 3206–3212, 2020.
  7. [7] Chao Cao, Hongbiao Zhu, Howie Choset, and Ji Zhang. TARE: A Hierarchical Framework for Efficiently Exploring Complex 3D Environments. Robotics: Science and Systems, 5:2, 2021.
  8. [8] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D Data in Indoor Environments. arXiv preprint, 2017.
  9. [9] Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. ShapeNet: An Information-Rich 3D Model Repository. arXiv preprint, 2015.
  10. [10] Liyan Chen, Huangying Zhan, Kevin Chen, Xiangyu Xu, Qingan Yan, Changjiang Cai, and Yi Xu. ActiveGamer: Active Gaussian Mapping through Efficient Rendering. In Conference on Computer Vision and Pattern Recognition, pages 16486–16497, 2025.
  11. [11] Liyan Chen, Huangying Zhan, Hairong Yin, Yi Xu, and Philippos Mordohai. Understanding While Exploring: Semantics-Driven Active Mapping. arXiv preprint arXiv:2506.00225, 2025.
  12. [12] Xiao Chen, Tai Wang, Quanyi Li, Tao Huang, Jiangmiao Pang, and Tianfan Xue. GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes. arXiv preprint, 2025.
  13. [13] Anna Dai, Sotiris Papatheodorou, Nils Funk, Dimos Tzoumanikas, and Stefan Leutenegger. Fast Frontier-Based Information-Driven Autonomous Exploration with an MAV. In International Conference on Robotics and Automation, pages 9570–9576, 2020.
  14. [14] Andrew J. Davison, Ian D. Reid, Nicholas D. Molton, and Olivier Stasse. MonoSLAM: Real-Time Single Camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):1052–1067, 2007.
  15. [15] Ziyue Feng, Huangying Zhan, Zheng Chen, Qingan Yan, Xiangyu Xu, Changjiang Cai, Bing Li, Qilun Zhu, and Yi Xu. NARUTO: Neural Active Reconstruction from Uncertain Target Observations. In Conference on Computer Vision and Pattern Recognition, pages 21572–21583, 2024.
  16. [16] Georgios Georgakis, Bernadette Bucher, Anton Arapin, Karl Schmeckpeper, Nikolai Matni, and Kostas Daniilidis. Uncertainty-Driven Planner for Exploration and Navigation. In International Conference on Robotics and Automation, pages 11295–11302, 2022.
  17. [17] Antoine Guédon and Vincent Lepetit. SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering. In Conference on Computer Vision and Pattern Recognition, pages 5354–5363, 2024.
  18. [18] Antoine Guédon, Pascal Monasse, and Vincent Lepetit. SCONE: Surface Coverage Optimization In Unknown Environments by Volumetric Integration. In Advances in Neural Information Processing Systems, 2022.
  19. [19] Antoine Guédon, Tom Monnier, Pascal Monasse, and Vincent Lepetit. MACARONS: Mapping And Coverage Anticipation with RGB Online Self-Supervision. In Conference on Computer Vision and Pattern Recognition, pages 940–951, 2023.
  20. [20] Antoine Guédon, Diego Gomez, Nissim Maruani, Bingchen Gong, George Drettakis, and Maks Ovsjanikov. MILo: Mesh-In-the-Loop Gaussian Splatting for Detailed and Efficient Surface Reconstruction. arXiv preprint, 2025.
  21. [21] Guillaume Hardouin, Julien Moras, Fabio Morbidi, Julien Marzat, and El Mustapha Mouaddib. Next-Best-View Planning for Surface Reconstruction of Large-Scale 3D Environments with Multiple UAVs. In International Conference on Intelligent Robots and Systems, pages 1567–1574, 2020.
  22. [22] Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107, 1968.
  23. [23] Lionel Heng, Alkis Gotovos, Andreas Krause, and Marc Pollefeys. Efficient Visual Exploration and Coverage with a Micro Aerial Vehicle in Unknown Environments. In International Conference on Robotics and Automation, pages 1071–1078, 2015.
  24. [24] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2D Gaussian Splatting for Geometrically Accurate Radiance Fields. In ACM SIGGRAPH, pages 1–11, 2024.
  25. [25] Stefan Isler, Reza Sabzevari, Jeffrey Delmerico, and Davide Scaramuzza. An Information Gain Formulation for Active Volumetric 3D Reconstruction. In International Conference on Robotics and Automation, pages 3477–3484, 2016.
  26. [26] Wen Jiang, Boshu Lei, and Kostas Daniilidis. FisherRF: Active View Selection and Mapping with Radiance Fields Using Fisher Information. In European Conference on Computer Vision, pages 422–440, 2024.
  27. [27] Liren Jin, Xingguang Zhong, Yue Pan, Jens Behley, Cyrill Stachniss, and Marija Popović. ActiveGS: Active Scene Reconstruction Using Gaussian Splatting. IEEE Robotics and Automation Letters, 2025.
  28. [28] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions on Graphics, 42(4):1–14, 2023.
  29. [29] Simon Kriegel, Christian Rink, Tim Bodenmüller, Alexander Narr, Michael Suppa, and Gerd Hirzinger. Next-Best-Scan Planning for Autonomous 3D Modeling. In International Conference on Intelligent Robots and Systems, pages 2850–2856, 2012.
  30. [30] Steven LaValle. Rapidly-Exploring Random Trees: A New Tool for Path Planning. Research Report 9811, 1998.
  31. [31] Soomin Lee, Le Chen, Jiahao Wang, Alexander Liniger, Suryansh Kumar, and Fisher Yu. Uncertainty Guided Policy for Active Robotic 3D Reconstruction Using Neural Radiance Fields. IEEE Robotics and Automation Letters, 7(4):12070–12077, 2022.
  32. [32] Shiyao Li, Antoine Guédon, Clémentin Boittiaux, Shizhe Chen, and Vincent Lepetit. NextBestPath: Efficient 3D Mapping of Unseen Environments. In International Conference on Learning Representations, 2025.
  33. [33] Yuetao Li, Zijia Kuang, Ting Li, Qun Hao, Zike Yan, Guyue Zhou, and Shaohui Zhang. ActiveSplat: High-Fidelity Scene Reconstruction Through Active Gaussian Splatting. IEEE Robotics and Automation Letters, 2025.
  34. [34] Miguel Mendoza, Juan Irving Vasquez-Gomez, Hind Taud, Luis Enrique Sucar, and Carolina Reta. Supervised Learning of the Next-Best-View for 3D Object Reconstruction. Pattern Recognition Letters, 2020.
  35. [35] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing Scenes As Neural Radiance Fields for View Synthesis. In European Conference on Computer Vision, pages 405–421, 2020.
  36. [36] Raul Mur-Artal and Juan D. Tardos. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, 31(5):1147–1163, 2015.
  37. [37] Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang. ActiveNeRF: Learning Where to See with Uncertainty Estimation. In European Conference on Computer Vision, pages 230–246, 2022.
  38. [38] Santhosh K. Ramakrishnan, Ziad Al-Halah, and Kristen Grauman. Occupancy Anticipation for Efficient Exploration and Navigation. In European Conference on Computer Vision, pages 400–418, 2020.
  39. [39] Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, and Georgia Gkioxari. Accelerating 3D Deep Learning with PyTorch3D. arXiv preprint arXiv:2007.08501, 2020.
  40. [40] Mike Roberts, Debadeepta Dey, Anh Truong, Sudipta Sinha, Shital Shah, Ashish Kapoor, Pat Hanrahan, and Neel Joshi. Submodular Trajectory Optimization for Aerial 3D Scanning. In International Conference on Computer Vision, pages 5324–5333, 2017.
  41. [41] Cyrill Stachniss, Giorgio Grisetti, and Wolfram Burgard. Information Gain-Based Exploration Using Rao-Blackwellized Particle Filters. Robotics: Science and Systems, 2(1):65–72, 2005.
  42. [42] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. VGGT: Visual Geometry Grounded Transformer. In Conference on Computer Vision and Pattern Recognition, pages 5294–5306, 2025.
  43. [43] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D Vision Made Easy. In Conference on Computer Vision and Pattern Recognition, pages 20697–20709, 2024.
  44. [44] Brian Yamauchi. A Frontier-Based Approach for Autonomous Exploration. In IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), pages 146–151, 1997.
  45. [45] Zike Yan, Haoxiang Yang, and Hongbin Zha. Active Neural Mapping. In International Conference on Computer Vision, pages 10981–10992, 2023.
  46. [46] Rui Zeng, Wang Zhao, and Yong-Jin Liu. PC-NBV: A Point Cloud Based Deep Network for Efficient Next Best View Planning. In International Conference on Intelligent Robots and Systems, 2020.
  47. [47] Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiaoxiao Long, and Ping Tan. RaDe-GS: Rasterizing Depth in Gaussian Splatting. arXiv preprint, 2024.
  48. [48] Boyu Zhou, Yichen Zhang, Xinyi Chen, and Shaojie Shen. FUEL: Fast UAV Exploration Using Incremental Frontier Structure and Hierarchical Planning. IEEE Robotics and Automation Letters, 6(2):779–786, 2021.

Appendix. In Sec. A, we present the details of the occupancy module, and the complete formulation of the coverage gain computation along with the a...
    Boyu Zhou, Yichen Zhang, Xinyi Chen, and Shaojie Shen. Fuel: Fast uav exploration using incremental frontier struc- ture and hierarchical planning.IEEE Robotics and Automa- tion Letters, 6(2):779–786, 2021. 3 Appendix In Sec. A, we present the details of the occupancy module, and the complete formulation of the coverage gain com- putation along with the a...