pith. machine review for the scientific record.

arxiv: 1702.01105 · v2 · submitted 2017-02-03 · 💻 cs.CV · cs.RO

Recognition: unknown

Joint 2D-3D-Semantic Data for Indoor Scene Understanding

Amir R. Zamir, Iro Armeni, Sasha Sax, Silvio Savarese

keywords: dataset, images, indoor, annotations, joint, large-scale, registered, semantic

We present a dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations. The dataset covers over 6,000 m² and contains over 70,000 RGB images, along with the corresponding depths, surface normals, semantic annotations, global XYZ images (all in the form of both regular and 360° equirectangular images) as well as camera information. It also includes registered raw and semantically annotated 3D meshes and point clouds. The dataset enables development of joint and cross-modal learning models and potentially unsupervised approaches utilizing the regularities present in large-scale indoor spaces. The dataset is available here: http://3Dsemantics.stanford.edu/

This paper has not been read by Pith yet.


Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI

    cs.CV 2021-09 accept novelty 8.0

    HM3D offers 1000 building-scale 3D environments that are larger and higher-fidelity than existing datasets, enabling better-performing embodied AI agents for tasks like PointGoal navigation.

  2. PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting

    cs.CV 2026-05 unverdicted novelty 7.0

    PointGS achieves semantic-consistent unsupervised 3D point cloud segmentation by using 3D Gaussian Splatting to bridge discrete points and continuous 2D images for distilling SAM semantics.

  3. BIMStruct3D: A Fully Automated Hybrid Learning Scan-to-BIM Pipeline with Integrated Topology Refinement

    cs.CV 2026-04 unverdicted novelty 7.0

    A hybrid pipeline combines semantic segmentation of point clouds with topology-aware reconstruction to generate BIM models, introduces the vIoU evaluation metric, and releases the DeKH dataset with demonstrated gains ...

  4. Holo360D: A Large-Scale Real-World Dataset with Continuous Trajectories for Advancing Panoramic 3D Reconstruction and Beyond

    cs.CV 2026-04 unverdicted novelty 7.0

    Holo360D is the first large-scale dataset providing continuous panoramic sequences with accurately aligned high-completeness depth maps and meshes for training panoramic 3D reconstruction models.

  5. STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation

    cs.CV 2026-04 conditional novelty 7.0

    STRNet improves goal-conditioned visual navigation by replacing simplistic encoders and pooling with a spatio-temporal fusion module that performs spatial graph reasoning and hybrid temporal modeling.

  6. EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision

    cs.CV 2026-05 unverdicted novelty 5.0

    EvObj learns evolving object-centric representations for unsupervised 3D instance segmentation by dynamically refining object candidates and completing partial geometries to bridge the synthetic-to-real domain gap, ou...

  7. From Spherical to Gaussian: A Comparative Analysis of Point Cloud Cropping Strategies in Large-Scale 3D Environments

    cs.CV 2026-05 unverdicted novelty 5.0

    Gaussian and linear cropping strategies for large point clouds improve 3D neural network performance over spherical crops, especially in outdoor scenes, and achieve new state-of-the-art results.

  8. INSIGHT: Indoor Scene Intelligence from Geometric-Semantic Hierarchy Transfer for Public Safety

    cs.CV 2026-04 unverdicted novelty 4.0

    INSIGHT transfers 2D semantic understanding from foundation models and traditional CV tools into 3D point clouds and compressed scene graphs for indoor public-safety mapping without target-domain labels.