PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning
State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'. Our key insight is that 'where to look?' can be treated purely as a perception problem, and learned without environment interactions. To address this, we propose a network that predicts two complementary potential functions conditioned on a semantic map and uses them to decide where to look for an unseen object. We train the potential function network using supervised learning on a passive dataset of top-down semantic maps, and integrate it into a modular framework to perform ObjectGoal navigation. Experiments on Gibson and Matterport3D demonstrate that our method achieves the state-of-the-art for ObjectGoal navigation while incurring up to 1,600x less computational cost for training. Code and pre-trained models are available: https://vision.cs.utexas.edu/projects/poni/
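The 'where to look?' decision described in the abstract can be illustrated with a minimal sketch. It assumes the network has already produced two potential maps over the cells of a top-down map, and that the agent picks the candidate cell with the highest combined potential. The function name `select_goal`, the weighted-sum combination, the `alpha` weight, and the candidate mask are all hypothetical choices made for illustration; the abstract does not specify how the two potentials are combined or which cells are eligible.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[bool]]


def select_goal(
    potential_a: Grid,
    potential_b: Grid,
    candidate_mask: Mask,
    alpha: float = 0.5,
) -> Tuple[int, int]:
    """Return the (row, col) of the candidate cell with the highest combined potential.

    Assumption: the two potentials are blended with a simple weighted sum,
    alpha * a + (1 - alpha) * b. This is an illustrative choice, not
    necessarily the combination rule used by PONI.
    """
    best_score = float("-inf")
    best_cell = None
    rows = zip(potential_a, potential_b, candidate_mask)
    for r, (row_a, row_b, row_m) in enumerate(rows):
        for c, (a, b, is_candidate) in enumerate(zip(row_a, row_b, row_m)):
            if not is_candidate:
                continue  # only cells marked as candidates can become goals
            score = alpha * a + (1.0 - alpha) * b
            if score > best_score:
                best_score = score
                best_cell = (r, c)
    if best_cell is None:
        raise ValueError("candidate_mask selects no cells")
    return best_cell


# Toy 2x2 maps (made-up values, not model outputs).
potential_a = [[0.1, 0.9], [0.2, 0.3]]
potential_b = [[0.8, 0.1], [0.2, 0.9]]
top_row_only = [[True, True], [False, False]]
all_cells = [[True, True], [True, True]]

goal_top = select_goal(potential_a, potential_b, top_row_only)
goal_any = select_goal(potential_a, potential_b, all_cells)
```

In a modular pipeline of the kind the abstract describes, the selected cell would then be handed to a separate 'how to navigate to (x, y)?' component; that hand-off is not shown here.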
Forward citations
Cited by 1 Pith paper
- MVP-Nav: Multi-layer Value Map Planner Navigator
MVP-Nav reconstructs explicit 3D physical occupancy from monocular RGB using foundation models and integrates it with semantic priorities via a Multi-layer Value Map for grounded planning in zero-shot object navigation.