Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling
Pith reviewed 2026-05-16 05:52 UTC · model grok-4.3
The pith
Picasso reconstructs multi-object scenes by jointly enforcing geometry, non-penetration, and physics through contact-graph-guided rejection sampling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Picasso is a reconstruction pipeline that builds multi-object scenes by considering geometry, non-penetration, and physics together. It relies on a fast rejection sampling method that reasons over multi-object interactions by leveraging an inferred object contact graph to guide samples. The resulting estimates are both geometrically consistent with sensor data and physically plausible, allowing direct import into simulators without manual correction.
What carries the argument
The central mechanism is physics-constrained rejection sampling guided by an inferred object contact graph that directs the sampler toward non-penetrating and stable configurations.
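To make the mechanism concrete, here is a minimal toy sketch of contact-graph-guided rejection sampling. It is not the paper's implementation: the 2D axis-aligned `Box` model, the `supports` dictionary standing in for the inferred contact graph, the `supported` stability proxy, and every parameter value are assumptions chosen for illustration. The contact graph guides the sampler in two ways: supporters are placed first, and each proposal is snapped so the object rests on its supporter. The constraints then accept or reject the proposal.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned 2D box: centre (x, z), width w, height h (toy model)."""
    x: float
    z: float
    w: float
    h: float

    @property
    def top(self) -> float:
        return self.z + self.h / 2

    @property
    def bottom(self) -> float:
        return self.z - self.h / 2


def penetrates(a: Box, b: Box, eps: float = 1e-9) -> bool:
    """True if the interiors of the two boxes overlap (touching is allowed)."""
    return (abs(a.x - b.x) < (a.w + b.w) / 2 - eps
            and abs(a.z - b.z) < (a.h + b.h) / 2 - eps)


def supported(child: Box, parent: Box) -> bool:
    """Crude stability proxy: the child's centre sits over the parent's top face."""
    return abs(child.x - parent.x) <= parent.w / 2


def sample_scene(estimates, supports, sigma=0.02, budget=1000, seed=0):
    """Place objects supporter-first; reject proposals that violate constraints.

    Assumes `supports` is acyclic and maps each object to its supporter
    (another object name or the string "ground").
    """
    rng = random.Random(seed)
    ground = Box(0.0, -0.5, 1e6, 1.0)  # very wide slab, top face at z = 0
    placed, rejections = {}, 0
    pending = list(estimates)
    while pending:
        # Pick an object whose supporter is already placed.
        name = next(n for n in pending
                    if supports[n] == "ground" or supports[n] in placed)
        pending.remove(name)
        est = estimates[name]
        parent = ground if supports[name] == "ground" else placed[supports[name]]
        for _ in range(budget):
            # Proposal: jitter x around the noisy estimate; the contact edge
            # fixes z so the object rests exactly on its supporter.
            cand = Box(rng.gauss(est.x, sigma), parent.top + est.h / 2, est.w, est.h)
            if supported(cand, parent) and not any(
                    penetrates(cand, other) for other in placed.values()):
                placed[name] = cand
                break
            rejections += 1
        else:
            raise RuntimeError(f"no valid pose for {name!r} within budget")
    return placed, rejections


# Hypothetical noisy estimates: the table sinks 1 cm into the ground and the
# mug overlaps the table top by 1 cm.
estimates = {
    "table": Box(0.00, 0.04, 0.40, 0.10),
    "mug": Box(0.05, 0.12, 0.06, 0.08),
}
supports = {"table": "ground", "mug": "table"}
scene, rejections = sample_scene(estimates, supports)
```

In this toy the sampled scene keeps each object's estimated size and roughly its horizontal position, while the heights are corrected so nothing interpenetrates. A wrong edge in `supports` would snap an object onto the wrong supporter, which is the failure mode the load-bearing premise is concerned with.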
If this is right
- Reconstructed scenes can be imported directly into simulators to predict dynamic behavior without corrective post-processing.
- Performance gains appear in contact-rich environments where inter-object constraints dominate the solution space.
- The same pipeline improves results on established benchmarks such as YCB-V while adding physical validity guarantees.
- Digital twins built from these reconstructions support more reliable simulation-based planning for contact-rich robotic tasks.
Where Pith is reading between the lines
- Jointly optimizing the contact graph together with the pose estimates rather than inferring it first could further reduce rejection rates on ambiguous scenes.
- Extending the sampler to incorporate temporal consistency across video frames would allow reconstruction of moving scenes without separate tracking.
- The physical-plausibility metric introduced in the benchmark could serve as a training signal for learning-based reconstructors that currently optimize only geometric error.
- Scaling the approach to scenes with dozens of objects will likely require more efficient graph inference or learned proposal distributions to keep the rejection sampler tractable.
Load-bearing premise
The inferred object contact graph is accurate enough to steer sampling toward valid solutions without excluding good configurations or requiring an impractical number of rejections.
What would settle it
A controlled experiment in which the contact-graph inference is deliberately corrupted on an otherwise solvable scene and the sampler either fails to return any valid configuration within a fixed budget or returns only interpenetrating or unstable arrangements.
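A harness for such an experiment might look like the sketch below. Everything in it is hypothetical: `corrupt_graph`, `failure_rate`, the `rate` parameter, and especially `stand_in_sampler`, which merely succeeds when the graph is uncorrupted and would have to be replaced by a call to the actual reconstruction pipeline with a fixed sampling budget.

```python
import random


def corrupt_graph(supports, rate, rng):
    """Rewire each object's supporter with probability `rate`.

    `supports` maps an object name to its supporter (an object name or
    "ground"). Rewiring may create cycles, which is intentional here:
    the point is to feed the sampler a wrong graph.
    """
    names = list(supports)
    corrupted = {}
    for child, parent in supports.items():
        if rng.random() < rate:
            options = [p for p in names + ["ground"] if p not in (child, parent)]
            corrupted[child] = rng.choice(options) if options else parent
        else:
            corrupted[child] = parent
    return corrupted


def failure_rate(sampler, supports, rate, trials=200, seed=0):
    """Fraction of trials in which `sampler` fails on a corrupted graph.

    `sampler` is any callable that takes a support graph and returns True
    if it found a valid configuration within its budget.
    """
    rng = random.Random(seed)
    failures = sum(not sampler(corrupt_graph(supports, rate, rng))
                   for _ in range(trials))
    return failures / trials


# Hypothetical scene and a stand-in sampler (NOT the real pipeline).
true_graph = {"table": "ground", "mug": "table", "book": "table"}


def stand_in_sampler(graph):
    return graph == true_graph


for rate in (0.0, 0.25, 0.5, 1.0):
    print(rate, failure_rate(stand_in_sampler, true_graph, rate))
```

Plotting the failure rate against the corruption rate for the real sampler would show how quickly performance degrades as the contact graph becomes unreliable.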
Original abstract
In the presence of occlusions and measurement noise, geometrically accurate scene reconstructions -- which fit the sensor data -- can still be physically incorrect. For instance, when estimating the poses and shapes of objects in the scene and importing the resulting estimates into a simulator, small errors might translate to implausible configurations including object interpenetration or unstable equilibrium. This makes it difficult to predict the dynamic behavior of the scene using a digital twin, an important step in simulation-based planning and control of contact-rich behaviors. In this paper, we posit that object pose and shape estimation requires reasoning holistically over the scene (instead of reasoning about each object in isolation), accounting for object interactions and physical plausibility. Towards this goal, our first contribution is Picasso, a physics-constrained reconstruction pipeline that builds multi-object scene reconstructions by considering geometry, non-penetration, and physics. Picasso relies on a fast rejection sampling method that reasons over multi-object interactions, leveraging an inferred object contact graph to guide samples. Second, we propose the Picasso dataset, a collection of 10 contact-rich real-world scenes with ground truth annotations, as well as a metric to quantify physical plausibility, which we open-source as part of our benchmark. Finally, we provide an extensive evaluation of Picasso on our newly introduced dataset and on the YCB-V dataset, and show it largely outperforms the state of the art while providing reconstructions that are both physically plausible and more aligned with human intuition.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that Picasso, a physics-constrained scene reconstruction pipeline, produces physically plausible multi-object reconstructions by using fast rejection sampling guided by an inferred object contact graph. It introduces a new 10-scene real-world dataset with ground-truth annotations and a physical plausibility metric, demonstrating outperformance over prior methods on this dataset and on YCB-V while yielding results more aligned with human intuition.
Significance. If the results hold, the work could advance simulation-based planning and control by enabling more reliable digital twins for contact-rich scenes. The new dataset and plausibility metric are valuable open contributions that address a gap in evaluating physical correctness beyond geometric fit. The holistic treatment of object interactions via the contact graph is a promising direction, though its robustness remains to be fully substantiated.
Major comments (3)
- [§5] §5 (Experiments): No ablation study isolates the contribution of the inferred contact graph to sampling efficiency or reconstruction quality. Without removing or replacing this component, it is impossible to determine whether the reported gains in physical plausibility derive from the graph-guided rejection sampling or from other elements of the pipeline.
- [§5.2] §5.2 and Table 2: The evaluation provides no quantitative analysis of contact-graph inference accuracy, rejection rates, or failure cases in contact-rich scenes. This leaves the central assumption (that the graph inferred from noisy geometry reliably guides sampling without excessive rejections or exclusion of valid configurations) unsupported by direct evidence.
- [§5.1] §5.1: Baseline comparisons lack full details on implementation, hyper-parameter tuning, and error bars on the new plausibility metric. The claim of outperformance is therefore only moderately supported, as variance and reproducibility cannot be assessed.
Minor comments (2)
- [Figure 3] Figure 3 and §4.2: The contact-graph visualization would benefit from explicit annotation of false-positive/negative edges to illustrate inference errors on real data.
- [§3] §3: Notation for the rejection-sampling acceptance probability could be clarified with a short pseudocode block to avoid ambiguity in the multi-object interaction term.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We agree that additional ablations, quantitative analyses of the contact graph, and greater transparency in baseline comparisons will strengthen the paper. We will incorporate these elements in the revised version. Below we address each major comment point by point.
- Referee: [§5] §5 (Experiments): No ablation study isolates the contribution of the inferred contact graph to sampling efficiency or reconstruction quality. Without removing or replacing this component, it is impossible to determine whether the reported gains in physical plausibility derive from the graph-guided rejection sampling or from other elements of the pipeline.
  Authors: We agree that an ablation isolating the contact graph's contribution is valuable. In the revised manuscript, we will add an ablation comparing the full Picasso pipeline to a variant using rejection sampling without contact-graph guidance. We will report differences in sampling efficiency (rejection rates and runtime) and reconstruction quality (geometric accuracy and physical plausibility metrics) on the Picasso dataset and YCB-V to clarify the graph's role.
  revision: yes
- Referee: [§5.2] §5.2 and Table 2: The evaluation provides no quantitative analysis of contact-graph inference accuracy, rejection rates, or failure cases in contact-rich scenes. This leaves the central assumption (that the graph inferred from noisy geometry reliably guides sampling without excessive rejections or exclusion of valid configurations) unsupported by direct evidence.
  Authors: We will add a new analysis subsection in the revision. This will include quantitative metrics on contact-graph inference accuracy (precision/recall against ground-truth contacts from our dataset annotations), average rejection rates during sampling, and a discussion of observed failure cases in contact-rich scenes. These results will directly support the reliability of the graph-guided approach.
  revision: yes
- Referee: [§5.1] §5.1: Baseline comparisons lack full details on implementation, hyper-parameter tuning, and error bars on the new plausibility metric. The claim of outperformance is therefore only moderately supported, as variance and reproducibility cannot be assessed.
  Authors: We acknowledge the need for greater reproducibility. In the revised manuscript, we will expand the baseline section with full implementation details, specific hyper-parameter values and tuning procedures for each method, and error bars (standard deviations over multiple runs) for the physical plausibility metric on both datasets. This will allow proper assessment of variance and strengthen the outperformance claims.
  revision: yes
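The contact-graph accuracy analysis promised in the second response (precision and recall against ground-truth contacts) could be computed along the lines of the sketch below. The adjacency-list representation, the function names, and the two example graphs are assumptions made for illustration; they are not data or code from the paper.

```python
def edge_set(adjacency):
    """Undirected edge set from an adjacency list {node: [neighbours]}."""
    return {frozenset((u, v))
            for u, nbrs in adjacency.items()
            for v in nbrs
            if u != v}


def contact_precision_recall(predicted, truth):
    """Precision and recall of predicted contact edges against ground truth.

    By convention here, an empty prediction has precision 1.0 and an empty
    ground truth has recall 1.0.
    """
    pred, true = edge_set(predicted), edge_set(truth)
    tp = len(pred & true)
    precision = tp / len(pred) if pred else 1.0
    recall = tp / len(true) if true else 1.0
    return precision, recall


# Hypothetical graphs: the prediction finds table-mug, misses table-book,
# and hallucinates mug-book.
truth = {"table": ["mug", "book"], "mug": [], "book": []}
predicted = {"table": ["mug"], "mug": ["book"], "book": []}
precision, recall = contact_precision_recall(predicted, truth)
print(precision, recall)  # 0.5 0.5
```

Treating edges as undirected is a design choice; if the graph encodes a directed support relation, ordered pairs would be the more appropriate unit of comparison.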
Circularity Check
New rejection sampling and dataset avoid circular derivation
The paper introduces Picasso as a novel physics-constrained pipeline relying on rejection sampling guided by an inferred contact graph, plus a new 10-scene dataset and physical plausibility metric. No equations or claims reduce by construction to prior fitted parameters; evaluations on the new dataset and YCB-V provide independent content. Minor self-citations may exist for background but are not load-bearing for the central reconstruction claims.
Axiom & Free-Parameter Ledger
Axioms (1)
- domain assumption: Rigid-body non-penetration and equilibrium constraints are sufficient to define physical plausibility for the target scenes.
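For readers who want this assumption in symbols, one common textbook formalization of rigid-body non-penetration and static equilibrium with Coulomb friction is sketched below. The notation (object regions $\mathcal{O}_i$, contact set $\mathcal{C}_i$, contact points $p_c$, contact forces $f_c$ with normal and tangential parts $f_c^{n}$ and $f_c^{t}$, friction coefficient $\mu$) is introduced here for illustration and is not taken from the paper, whose plausibility metric may be defined differently.

```latex
% Non-penetration: object interiors are pairwise disjoint.
\operatorname{int}(\mathcal{O}_i) \cap \operatorname{int}(\mathcal{O}_j) = \emptyset
\quad \forall\, i \neq j

% Static equilibrium of object i (mass m_i, centre of mass x_i, gravity g):
% contact forces balance gravity and produce no net torque.
\sum_{c \in \mathcal{C}_i} f_c + m_i g = 0,
\qquad
\sum_{c \in \mathcal{C}_i} (p_c - x_i) \times f_c = 0

% Coulomb friction: contacts push but do not pull, and stay inside the cone.
f_c^{n} \ge 0,
\qquad
\lVert f_c^{t} \rVert \le \mu\, f_c^{n}
```

Under this reading, the axiom amounts to saying that a reconstruction is plausible whenever some set of contact forces satisfying these conditions exists for every object.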
Forward citations
Cited by 1 Pith paper
- Reconstruction by Generation: 3D Multi-Object Scene Reconstruction from Sparse Observations
RecGen achieves state-of-the-art 3D multi-object scene reconstruction from sparse RGB-D views by combining compositional synthetic scene generation with strong 3D shape priors, outperforming SAM3D by 30%+ in shape qua...