
arxiv: 2110.00087 · v1 · pith:YFMA4J6Gnew · submitted 2021-09-30 · 💻 cs.CV · cs.AI · cs.LG · cs.RO

Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects

classification 💻 cs.CV cs.AI cs.LG cs.RO
keywords depth, transparent, objects, completion, dataset, rgb-d, transparenet, automated

The basis of many object manipulation algorithms is RGB-D input. Yet, commodity RGB-D sensors can only provide distorted depth maps for a wide range of transparent objects due to light refraction and absorption. To tackle the perception challenges posed by transparent objects, we propose TranspareNet, a joint point cloud and depth completion method that can complete the depth of transparent objects in cluttered and complex scenes, even when the vessels are partially filled with fluid. To address the shortcomings of existing transparent object data collection schemes in the literature, we also propose an automated dataset creation workflow that consists of robot-controlled image collection and vision-based automatic annotation. Through this automated workflow, we created the Toronto Transparent Objects Depth Dataset (TODD), which consists of nearly 15,000 RGB-D images. Our experimental evaluation demonstrates that TranspareNet outperforms existing state-of-the-art depth completion methods on multiple datasets, including ClearGrasp, and that it also handles cluttered scenes when trained on TODD. Code and dataset will be released at https://www.pair.toronto.edu/TranspareNet/



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Trans2Occ: Voxel Occupancy Estimation and Grasp for Transparent Objects from Simulation to Reality

    cs.RO 2026-06 unverdicted novelty 4.0

    A simulation-trained model predicts voxel occupancy from single RGB views for transparent object grasping and transfers to real robotic setups without fine-tuning.