arxiv: 2605.05095 · v1 · submitted 2026-05-06 · 💻 cs.GR · cs.CV· cs.LG· stat.ML

Recognition: unknown

A Bayesian Approach for Task-Specific Next-Best-View Selection with Uncertain Geometry

Jingsen Zhu , Silvia Sell\'an , Alexander Terenin

Authors on Pith no claims yet

Pith reviewed 2026-05-08 16:10 UTC · model grok-4.3

classification 💻 cs.GR cs.CVcs.LGstat.ML

keywords next-best-view selectionBayesian decision theory3D reconstructionactive sensingtask-specific planninguncertain geometrystochastic surface reconstruction

0 comments

The pith

A Bayesian framework selects next camera views to optimize performance on a specific downstream task rather than reduce uncertainty uniformly across the 3D surface.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for active next-best-view selection during 3D reconstruction from point clouds. It places a prior distribution over possible implicit surfaces, updates it to a posterior using stochastic surface reconstruction, and then applies Bayesian decision theory to choose the next camera pose that maximizes expected improvement on the intended task. This differs from prior approaches by focusing scanning effort only on uncertainty that affects task performance, such as semantic classification, part segmentation, or PDE-based physics simulation. Experiments show the method reaches higher task accuracy using fewer total views than standard baselines and general uncertainty-reduction techniques.

Core claim

By framing camera selection as a Bayesian decision problem, the framework computes the expected task utility of each candidate view under the posterior distribution over implicit surfaces, enabling direct optimization for the downstream task instead of generic uncertainty reduction across the entire geometry.

What carries the argument

The posterior distribution over implicit surfaces obtained from stochastic surface reconstruction, which is used inside the Bayesian decision rule to evaluate expected task performance for each possible next view.

If this is right

Scanning effort concentrates on regions that affect the task rather than being spread evenly.
Fewer total views suffice to reach a given level of task performance.
The same reconstruction pipeline supports multiple tasks without modification to the surface model itself.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be extended to sequential decision making over multiple future views by computing multi-step expected utility.
In robotics settings it may reduce total data collection time when the downstream task is known in advance.
It raises the possibility of jointly learning the prior and the task utility function from data for specific object classes.

Load-bearing premise

That the posterior obtained from stochastic surface reconstruction accurately captures the uncertainty relevant to the task and that the Bayesian rule correctly identifies the view maximizing task performance.

What would settle it

An experiment on one of the three tasks where a view selected by the Bayesian method produces lower task accuracy than a view selected by uniform uncertainty reduction, despite using the same number of total scans.

Figures

Figures reproduced from arXiv: 2605.05095 by Alexander Terenin, Jingsen Zhu, Silvia Sell\'an.

**Figure 1.** Figure 1: We introduce a novel framework for optimizing the next view angle given an incomplete observation of a given object. Our algorithm is view at source ↗

**Figure 2.** Figure 2: Using existing stochastic surface reconstruction techniques, we process an input point cloud (left) and compute a posterior distribution, from which we view at source ↗

**Figure 3.** Figure 3: Qualitative comparison with FPS and uncertainty reduction in the classification task of the synthetic pyramid and the Truck-ModelNet dataset. view at source ↗

**Figure 4.** Figure 4: Qualitative comparison with FPS and uncertainty reduction in the coldest point discovery task with heat diffusion simulation. The reference coldest view at source ↗

**Figure 6.** Figure 6: Demonstration of our synthetic pyramid and Truck-ModelNet10 view at source ↗

**Figure 5.** Figure 5: Quantitative comparison on the classification of the ModelNet10 view at source ↗

**Figure 7.** Figure 7: Comparison of the stable hit time 𝑇stable in the synthetic pyramid and Truck-ModelNet10 datasets, including between our discrete candidate search and multi-start gradient-based optimization strategies. the cumulative percentile of scenes achieving first correct and stable hits within 1–5 steps, shown in view at source ↗

**Figure 8.** Figure 8: Qualitative comparison with FPS and uncertainty reduction in the segmentation task of ShapeNet dataset. The viewpoint selected by the current step view at source ↗

**Figure 9.** Figure 9: Ablation study between our method with task-specific acquisition view at source ↗

**Figure 10.** Figure 10: Comparison on the classification of the OmniObject3D dataset. view at source ↗

read the original abstract

We develop a framework for task-specific active next-best-view selection in 3D reconstruction from point clouds, by casting the problem in the language of Bayesian decision theory. Our framework works by (a) placing a prior distribution over the space of implicit surfaces, (b) using recently-developed stochastic surface reconstruction methods to calculate the resulting posterior distribution, then (c) using the posterior distribution to carefully reason about which view to scan next. This enables us to perform camera selection in a manner that is directly optimized for the intended use of the reconstructed data - meaning, we reduce uncertainty only in those regions that make a difference in the task at hand, as opposed to prior approaches that reduce it uniformly across space. We evaluate our method across three distinct downstream tasks: semantic classification, segmentation, and PDE-guided physics simulation. Experimental results demonstrate that our framework achieves superior task performance with fewer views compared to commonly used baselines and prior general uncertainty-reduction techniques.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies Bayesian decision theory to pick next views that directly improve task performance in 3D reconstruction instead of cutting global uncertainty.

read the letter

The core idea is to treat next-best-view selection as a Bayesian decision problem. They put a prior over implicit surfaces, update it to a posterior using stochastic reconstruction from point clouds, and then choose the view that maximizes expected task utility rather than reducing uncertainty everywhere. This is evaluated on semantic classification, segmentation, and PDE-based physics simulation, with the claim that it reaches better task results using fewer views than uniform-uncertainty or heuristic baselines. The framing is clean and connects reconstruction uncertainty to downstream use in a principled way. It builds directly on existing stochastic surface methods without reinventing them, and the decision-theoretic step is a straightforward application that fits the setting. The multi-task evaluation is a plus because it shows the approach is not tied to one narrow use case. The experiments are the main soft spot. The abstract states superior performance, but without seeing the actual numbers, protocols, baseline implementations, or statistical tests it is difficult to gauge how large the gains are or whether the comparisons are fair. The central assumption that the posterior from the reconstruction method captures the uncertainty that matters for the task is plausible, but it could be sensitive to mismatches between the surface model and the task model. This work is aimed at people in active 3D scanning, robotics perception, and task-driven vision who care about efficient data acquisition. It is coherent enough on its own terms to deserve a serious referee, even if the results section will probably require revisions for clarity and rigor.

Referee Report

2 major / 1 minor

Summary. The paper develops a Bayesian decision-theoretic framework for task-specific next-best-view selection during 3D reconstruction from point clouds. It places a prior over implicit surfaces, computes the posterior via stochastic surface reconstruction methods, and selects the next camera pose by optimizing expected task performance (rather than uniform uncertainty reduction) for three downstream tasks: semantic classification, segmentation, and PDE-guided physics simulation. The central claim is that the resulting view sequence yields superior task metrics with fewer views than standard baselines and general uncertainty-driven methods.

Significance. If the experimental claims are substantiated with quantitative results, this would constitute a useful advance in active 3D sensing by making view selection directly sensitive to the intended use of the reconstruction. The principled use of posteriors from stochastic reconstruction to drive task-specific decisions is a clear conceptual strength and could improve data efficiency in robotics, simulation, and vision pipelines.

major comments (2)

[Experimental Evaluation] The abstract and introduction assert superior task performance on semantic classification, segmentation, and PDE simulation, yet the experimental section supplies no quantitative metrics, no description of the evaluation protocol, no baseline implementations or parameter settings, and no statistical tests. Without these elements the central claim cannot be assessed and the comparison to 'commonly used baselines and prior general uncertainty-reduction techniques' remains unevaluable.
[Method and Assumptions] The weakest assumption—that the posterior over implicit surfaces obtained from stochastic reconstruction accurately reflects the uncertainty relevant to the downstream task—is stated but not tested. No ablation or sensitivity analysis is provided to show that task performance degrades when this posterior is replaced by a uniform or heuristic uncertainty measure.

minor comments (1)

[Method] Notation for the posterior and the task-specific utility function should be introduced with explicit equations early in the method section to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review. We address each of the major comments below and outline the revisions we will make to the manuscript.

read point-by-point responses

Referee: [Experimental Evaluation] The abstract and introduction assert superior task performance on semantic classification, segmentation, and PDE simulation, yet the experimental section supplies no quantitative metrics, no description of the evaluation protocol, no baseline implementations or parameter settings, and no statistical tests. Without these elements the central claim cannot be assessed and the comparison to 'commonly used baselines and prior general uncertainty-reduction techniques' remains unevaluable.

Authors: We agree with this assessment. The current manuscript version does not provide the necessary quantitative details in the experimental section to fully substantiate the claims. In the revised manuscript, we will include specific performance metrics for each task (such as classification accuracy, segmentation IoU, and simulation error measures), a detailed description of the evaluation protocol including the datasets used, number of trials, and metrics computation, specifications of the baseline methods and their parameter settings, and results of statistical tests (e.g., t-tests or Wilcoxon tests) to compare our method against the baselines. This will enable a proper evaluation of the superiority claims. revision: yes
Referee: [Method and Assumptions] The weakest assumption—that the posterior over implicit surfaces obtained from stochastic reconstruction accurately reflects the uncertainty relevant to the downstream task—is stated but not tested. No ablation or sensitivity analysis is provided to show that task performance degrades when this posterior is replaced by a uniform or heuristic uncertainty measure.

Authors: We acknowledge that the manuscript does not include an explicit test or ablation of this key assumption. To address this, we will add a new section or subsection with an ablation study. This study will compare the full Bayesian approach using the stochastic posterior against variants that use uniform uncertainty or heuristic measures (e.g., point density or simple variance). We will report the resulting task performance degradation to demonstrate that the posterior is indeed crucial for the observed improvements. This will strengthen the justification for the Bayesian decision-theoretic framework. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's claimed chain places a prior over implicit surfaces, obtains the posterior via external stochastic surface reconstruction methods, and applies a Bayesian decision rule to select task-optimized views. Superiority is asserted via direct experimental comparison on three downstream task metrics (classification, segmentation, physics simulation) against baselines, without any step that reduces a claimed prediction or result to a quantity fitted from the evaluation data itself, a self-citation chain, or a definitional equivalence. The framework therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the assumption that a prior over implicit surfaces can be meaningfully updated via stochastic reconstruction to produce a posterior usable for task-specific decisions; no free parameters or invented entities are identifiable from the abstract alone.

axioms (1)

domain assumption A prior distribution over the space of implicit surfaces exists and can be updated with point-cloud observations via stochastic reconstruction methods.
This is the foundational modeling choice stated in the abstract for the Bayesian pipeline.

pith-pipeline@v0.9.0 · 5474 in / 1165 out tokens · 31322 ms · 2026-05-08T16:10:43.040994+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

21 extracted references

[1]

Iskander Azangulov, Andrei Smolensky, Alexander Terenin, and Viacheslav Borovit- skiy. 2024a. Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case.Journal of Machine Learning Research (2024). Iskander Azangulov, Andrei Smolensky, Alexander Terenin, and Viacheslav Borovitskiy. 2024b. Stationary Kernels an...

2024
[2]

InAdvances in Neural Information Processing Systems

Matérn Gaussian Processes on Riemannian Manifolds. InAdvances in Neural Information Processing Systems. Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxoing Xiao, Li Yi, and Fisher Yu. 2015.ShapeNet: An Information-Rich 3D Model Repository. Technical Report....

2015
[3]

Roman Garnett

The Farthest Point Strategy for Progressive Image Sampling.IEEE Transactions on Image Processing(1997). Roman Garnett. 2023.Bayesian Optimization. Cambridge University Press. Lily Goli, Cody Reading, Silvia Sellán, Alec Jacobson, and Andrea Tagliasacchi

1997
[4]

José Miguel Hernández-Lobato, Matthew W

Measurement of Areas on a Sphere Using Fibonacci and Latitude–longitude Lattices.Mathematical Geosciences(2010). José Miguel Hernández-Lobato, Matthew W. Hoffman, and Zoubin Ghahramani

2010
[5]

Advances in Neural Information Processing Systems(2014)

Predictive Entropy Search for Efficient Global Optimization of Black-box Functions. Advances in Neural Information Processing Systems(2014). Sidhanth Holalkere, David S Bindel, Silvia Sellán, and Alexander Terenin

2014
[6]

ACM Transactions on Graphics(2013)

Screened Poisson Surface Reconstruction. ACM Transactions on Graphics(2013). Diederik P. Kingma and Jimmy Ba

2013
[7]

William E

Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent.Advances in Neural Information Processing Systems(2023). William E. Lorensen and Harvey E. Cline

2023
[8]

Bailey Miller, Hanyu Chen, Alice Lai, and Ioannis Gkioulekas

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.Communications of the ACM(2021). Bailey Miller, Hanyu Chen, Alice Lai, and Ioannis Gkioulekas

2021
[9]

InComputer Vision and Pattern Recognition

Objects as Volumes: A Stochastic Geometry View of Opaque Solids. InComputer Vision and Pattern Recognition. SIGGRAPH Conference Papers ’26, July 19–23, 2026, Los Angeles, CA, USA. A Bayesian Approach for Task-Specific Next-Best-View Selection with Uncertain Geometry•11 Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang

2026
[10]

Ulrich Pinkall and Konrad Polthier

Neural Laplacian Operator for 3D Point Clouds.ACM Transactions on Graphics (2024). Ulrich Pinkall and Konrad Polthier

2024
[11]

Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas Guibas

Computing Discrete Minimal Surfaces and their Conjugates.Experimental Mathematics(1993). Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas Guibas

1993
[12]

Carl Edward Rasmussen and Christopher K

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.Advances in Neural Information Processing Systems(2017). Carl Edward Rasmussen and Christopher K. I. Williams. 2006.Gaussian Processes for Machine Learning. MIT Press. Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, and Georgia Gkioxari

2017
[13]

Silvia Sellán and Alec Jacobson

The Gittins Index: A Design Principle for Decision-Making Under Uncertainty.INFORMS Tutorials in Operations Research (2025). Silvia Sellán and Alec Jacobson

2025
[14]

Nicholas Sharp and Keenan Crane

Stochastic Poisson Surface Reconstruction.ACM Transactions on Graphics(2022). Nicholas Sharp and Keenan Crane

2022
[15]

Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Liang Pan Jiawei Ren, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, and Ziwei Liu

Pathwise Conditioning of Gaussian Processes.Journal of Machine Learning Research(2021). Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Liang Pan Jiawei Ren, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, and Ziwei Liu

2021
[16]

Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li

Aerial Path Planning for Urban Geometry and Texture Co-Capture.ACM Transac- tions on Graphics(2025). Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li

2025
[17]

A Bayesian Approach for Task-Specific Next- Best-View Selection with Uncertain Geometry

Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation. InInternational Conference on Computer Vision. SIGGRAPH Conference Papers ’26, July 19–23, 2026, Los Angeles, CA, USA. 12•Jingsen Zhu, Silvia Sellán, and Alexander Terenin Supplementary Material for “A Bayesian Approach for Task-Specif...

2026
[18]

B Additional Experiments 3D reconstruction quality using our coverage-guided acquisition function:We validate that our method is compatible with the tra- ditional 3D reconstruction task, focusing on scene completeness and reducing reconstruction error. In this experiment, we use our Chamfer-distance-based acquisition function 𝛼 (CD) 𝑓| D 𝑡 (𝜃) targeting s...

2025
[19]

Notably, for the segmentation task, the coverage-guided approach even yields a worse result than random search

From the results, our method with task-specific acquisition significantly outperforms the coverage-guided approach. Notably, for the segmentation task, the coverage-guided approach even yields a worse result than random search. We see that solely focusing on geometry coverage ignores scene semantics, prioritiz- ing large unseen geometry regions, rather th...

2023
[20]

The first two rows are the same as those in Table 1 of our main paper

The average number of cameras required for discovering all parts of the object over 80 test shapes, under different 𝑁target criteria. The first two rows are the same as those in Table 1 of our main paper. Less is better. 𝑁target 20 40 60 80 100 Ours (Task-specific)2.93 3.16 3.43 3.67 3.75 Random 3.29 3.42 3.65 3.93 4.10 Ours (Coverage) 3.23 3.46 3.75 4.11...

2026
[21]

SIGGRAPH Conference Papers ’26, July 19–23, 2026, Los Angeles, CA, USA

Comparison on the classification of the OmniObject3D dataset. SIGGRAPH Conference Papers ’26, July 19–23, 2026, Los Angeles, CA, USA

2026