GLADOS reconstructs 3D geometry from disjoint views by generating intermediate perspectives, performing robust coarse alignment that tolerates generative inconsistencies, and iteratively expanding context for consistency.
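As a reading aid, the three-stage loop described above can be sketched schematically. Everything in the snippet is a placeholder under assumed names (`generate_intermediate_view`, `coarse_align`, `reconstruct`); it only illustrates the control flow of a generate-align-expand pipeline, not GLADOS's actual models or interfaces.

```python
# Schematic sketch of a generate-align-expand loop; every function here is an
# illustrative placeholder, not GLADOS's actual model or interface.
import numpy as np

def generate_intermediate_view(view_a, view_b):
    # Placeholder: a generative model would synthesize a bridging perspective here.
    return 0.5 * (view_a + view_b)

def coarse_align(source_pts, target_pts):
    # Placeholder for robust coarse alignment (e.g., RANSAC over correspondences):
    # a translation-only fit using medians, which shrugs off some outliers.
    return source_pts + (np.median(target_pts, axis=0) - np.median(source_pts, axis=0))

def reconstruct(disjoint_views):
    # Iteratively expand the reconstructed context one disjoint view at a time.
    context = [disjoint_views[0]]
    for view in disjoint_views[1:]:
        bridge = generate_intermediate_view(context[-1], view)  # 1. imagine an in-between view
        aligned = coarse_align(view, bridge)                    # 2. coarse-align against it
        context.append(aligned)                                 # 3. grow the context for consistency
    return np.concatenate(context, axis=0)

if __name__ == "__main__":
    views = [np.random.rand(100, 3) + i for i in range(4)]  # toy stand-ins for disjoint views
    print(reconstruct(views).shape)  # (400, 3)
```

The point is only the ordering: synthesize a bridging view, align against it robustly, then fold the result back into the context used for the next step.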
citation dossier
Matterport3D: Learning from RGB-D Data in Indoor Environments
why this work matters in Pith
Pith has found this work in 17 reviewed papers. Its strongest current cluster is cs.CV (8 papers). The largest review-status bucket among citing papers is UNVERDICTED (17 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.
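As a rough illustration of that dossier-first, bounded-explorer layout (not Pith's implementation; the record fields, cluster names other than cs.CV, and the cap are assumptions), the counts on this page could be derived like this:

```python
# Illustrative facet counting for a dossier plus a bounded explorer.
# The record fields ("cluster", "year", "verdict") and the cap are assumptions,
# not Pith's actual schema.
from collections import Counter

def build_dossier(citing_papers, explorer_cap=50):
    clusters = Counter(p["cluster"] for p in citing_papers)
    dossier = {
        "total": len(citing_papers),
        "strongest_cluster": clusters.most_common(1)[0],
        "years": dict(Counter(p["year"] for p in citing_papers)),
        "verdicts": dict(Counter(p["verdict"] for p in citing_papers)),
    }
    explorer = citing_papers[:explorer_cap]  # bounded: never render every citing paper
    return dossier, explorer

# Toy records matching this page's headline numbers (non-cs.CV cluster names are made up).
papers = ([{"cluster": "cs.CV", "year": 2026, "verdict": "UNVERDICTED"}] * 8
          + [{"cluster": "cs.RO", "year": 2026, "verdict": "UNVERDICTED"}] * 5
          + [{"cluster": "cs.AI", "year": 2026, "verdict": "UNVERDICTED"}] * 4)
dossier, explorer = build_dossier(papers)
print(dossier["strongest_cluster"], dossier["verdicts"])  # ('cs.CV', 8) {'UNVERDICTED': 17}
```

Capping the explorer while keeping full facet counts is what lets the page summarize all 17 citing papers without rendering each one.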
years
2026 (17)
verdicts
UNVERDICTED (17)
representative citing papers
- OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal, and wheeled robots, and 1,779 human-teleoperated trajectories across 170 environments.
- Holo360D is the first large-scale dataset providing continuous panoramic sequences with accurately aligned high-completeness depth maps and meshes for training panoramic 3D reconstruction models.
- NavOL collects expert trajectory labels online from a global planner during policy rollouts in simulation to train a diffusion navigation policy, mitigating distribution shift and improving performance on visual navigation tasks.
- Imagining in 360° decouples visual search into a single-step probabilistic semantic layout predictor and an actor, removing the need for multi-turn CoT reasoning and trajectory annotations while improving efficiency in 360° environments.
- PLMD applies a denoising diffusion model to predict labels for unknown map regions, allowing goal localization in unexplored environments by substituting completed labels into existing navigation pipelines.
- SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.
- The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.
- OVAL introduces an open-vocabulary memory model with structured descriptors and multi-value frontier scoring to enable efficient lifelong object goal navigation in unseen settings.
- ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.
- The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.
- Dual-Anchoring adds explicit progress tokens and retrospective landmark verification to VLN agents, cutting state drift and lifting success rate by 15.2% overall with 24.7% gains on long trajectories.
- HRNav decomposes image-goal navigation into VLM-based short-horizon planning and RL-based execution with a wandering suppression penalty to improve performance in complex unseen settings.
- MV3DIS uses 3D-guided mask matching and depth consistency to produce more consistent multi-view 2D masks that refine into accurate zero-shot 3D instances.
- OpenSpatial supplies a principled open-source data engine and 3-million-sample dataset that raises spatial-reasoning model performance by an average of 19 percent on benchmarks.
- Audio Spatially-Guided Fusion improves generalization in audio-visual navigation on unheard sound sources by extracting spatial audio features and adaptively fusing them with visual data.
- JoyAI-Image unifies visual understanding, generation, and editing in one model and claims stronger spatial intelligence through bidirectional perception-generation loops.
citing papers explorer
- Mind the Gap: Geometrically Accurate Generative Reconstruction from Disjoint Views
GLADOS reconstructs 3D geometry from disjoint views by generating intermediate perspectives, performing robust coarse alignment that tolerates generative inconsistencies, and iteratively expanding context for consistency.
- Beyond Isolation: A Unified Benchmark for General-Purpose Navigation
OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal, and wheeled robots, and 1,779 human-teleoperated trajectories across 170 environments.
- Holo360D: A Large-Scale Real-World Dataset with Continuous Trajectories for Advancing Panoramic 3D Reconstruction and Beyond
Holo360D is the first large-scale dataset providing continuous panoramic sequences with accurately aligned high-completeness depth maps and meshes for training panoramic 3D reconstruction models.
- NavOL: Navigation Policy with Online Imitation Learning
NavOL collects expert trajectory labels online from a global planner during policy rollouts in simulation to train a diffusion navigation policy, mitigating distribution shift and improving performance on visual navigation tasks (a generic sketch of this online-relabeling pattern appears after the explorer list).
- Beyond Thinking: Imagining in 360° for Humanoid Visual Search
Imagining in 360° decouples visual search into a single-step probabilistic semantic layout predictor and an actor, removing the need for multi-turn CoT reasoning and trajectory annotations while improving efficiency in 360° environments.
- Plug-and-Play Label Map Diffusion for Universal Goal-Oriented Navigation
PLMD applies a denoising diffusion model to predict labels for unknown map regions, allowing goal localization in unexplored environments by substituting completed labels into existing navigation pipelines.
- SpaAct: Spatially-Activated Transition Learning with Curriculum Adaptation for Vision-Language Navigation
SpaAct activates spatial awareness in VLMs using action retrospection, future frame prediction, and progressive curriculum learning to reach SOTA on VLN-CE benchmarks.
- Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective
The paper proposes a problem-driven taxonomy for feed-forward 3D scene modeling that groups methods by five core challenges: feature enhancement, geometry awareness, model efficiency, augmentation strategies, and temporal-aware modeling.
- OVAL: Open-Vocabulary Augmented Memory Model for Lifelong Object Goal Navigation
OVAL introduces an open-vocabulary memory model with structured descriptors and multi-value frontier scoring to enable efficient lifelong object goal navigation in unseen settings.
- ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment
ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.
- From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation
The paper surveys 3D asset generation methods and organizes them around the full production pipeline to assess which outputs meet engine-level requirements for interactive applications.
- Dual-Anchoring: Addressing State Drift in Vision-Language Navigation
Dual-Anchoring adds explicit progress tokens and retrospective landmark verification to VLN agents, cutting state drift and lifting success rate by 15.2% overall with 24.7% gains on long trajectories.
- Think before Go: Hierarchical Reasoning for Image-goal Navigation
HRNav decomposes image-goal navigation into VLM-based short-horizon planning and RL-based execution with a wandering suppression penalty to improve performance in complex unseen settings.
- MV3DIS: Multi-View Mask Matching via 3D Guides for Zero-Shot 3D Instance Segmentation
MV3DIS uses 3D-guided mask matching and depth consistency to produce more consistent multi-view 2D masks that refine into accurate zero-shot 3D instances.
- OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence
OpenSpatial supplies a principled open-source data engine and 3-million-sample dataset that raises spatial-reasoning model performance by an average of 19 percent on benchmarks.
- Audio Spatially-Guided Fusion for Audio-Visual Navigation
Audio Spatially-Guided Fusion improves generalization in audio-visual navigation on unheard sound sources by extracting spatial audio features and adaptively fusing them with visual data.
- Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation
JoyAI-Image unifies visual understanding, generation, and editing in one model and claims stronger spatial intelligence through bidirectional perception-generation loops.
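Several entries above rely on collecting expert labels online during policy rollouts (NavOL most directly, as flagged in its entry). The toy sketch below shows that generic online-relabeling pattern on a 1-D navigation task; it is not NavOL's method, the "global planner" is a one-line heuristic, and the diffusion policy is replaced by a trivial nearest-centroid classifier purely for brevity.

```python
# Generic online-relabeling (DAgger-style) loop on a toy 1-D navigation task.
# This illustrates the pattern only: the expert is a trivial planner and the
# learned policy is a nearest-centroid classifier, not a diffusion policy.
import numpy as np

rng = np.random.default_rng(0)
GOAL = 7

def expert_action(pos):
    # Stand-in for a global planner: step toward the goal (-1, 0, or +1).
    return int(np.sign(GOAL - pos))

def fit_policy(states, actions):
    # Trivial policy: remember the mean state at which each action was labeled.
    centroids = {a: float(np.mean([s for s, act in zip(states, actions) if act == a]))
                 for a in set(actions)}
    return lambda pos: min(centroids, key=lambda a: abs(pos - centroids[a]))

# Seed with two expert labels, then: roll out the current policy, relabel the
# states it actually visits with the expert, aggregate, and refit.
states, actions = [0, 10], [expert_action(0), expert_action(10)]
policy = fit_policy(states, actions)
for _ in range(5):
    pos = int(rng.integers(0, 11))
    for _ in range(15):                      # rollout under the *current* policy
        states.append(pos)
        actions.append(expert_action(pos))   # online expert label for a visited state
        pos = int(np.clip(pos + policy(pos), 0, 10))
    policy = fit_policy(states, actions)     # aggregate and refit

print(len(states), "labels; policy(2) ->", policy(2), " policy(9) ->", policy(9))
```

The property being illustrated is that labels are gathered on states the current policy actually visits, which is what mitigates distribution shift relative to training on expert trajectories alone.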