arxiv: 2201.10029 · v2 · pith:34MDAPUMnew · submitted 2022-01-25 · 💻 cs.CV · cs.AI

PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

classification 💻 cs.CV cs.AI
keywords learning · navigation · objectgoal · potential · functions · look · poni · computational

State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'. Our key insight is that 'where to look?' can be treated purely as a perception problem, and learned without environment interactions. To address this, we propose a network that predicts two complementary potential functions conditioned on a semantic map and uses them to decide where to look for an unseen object. We train the potential function network using supervised learning on a passive dataset of top-down semantic maps, and integrate it into a modular framework to perform ObjectGoal navigation. Experiments on Gibson and Matterport3D demonstrate that our method achieves the state-of-the-art for ObjectGoal navigation while incurring up to 1,600x less computational cost for training. Code and pre-trained models are available: https://vision.cs.utexas.edu/projects/poni/
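
To make the 'where to look?' step concrete, here is a minimal sketch of how two potential maps could be turned into a single map location to visit. The abstract does not say how the two potentials are combined, so the weighted sum, the `alpha` weight, and the names `select_goal`, `potential_a`, and `potential_b` are all illustrative assumptions, not the authors' implementation. In PONI the potentials would come from the trained network; here they are hand-written grids.

```python
from typing import List, Tuple

Grid = List[List[float]]


def select_goal(potential_a: Grid, potential_b: Grid, alpha: float = 0.5) -> Tuple[int, int]:
    """Return the (row, col) cell with the highest combined potential.

    Assumption: the two potentials are merged by a convex combination
    alpha * a + (1 - alpha) * b. The source abstract does not specify this.
    """
    if not potential_a or len(potential_a) != len(potential_b):
        raise ValueError("potential maps must be non-empty and have the same height")

    best_cell = (0, 0)
    best_score = float("-inf")
    for r, (row_a, row_b) in enumerate(zip(potential_a, potential_b)):
        if len(row_a) != len(row_b):
            raise ValueError("potential maps must have the same width")
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            score = alpha * a + (1.0 - alpha) * b
            # Strict '>' keeps the first cell in row-major order on ties.
            if score > best_score:
                best_score = score
                best_cell = (r, c)
    return best_cell


# Hypothetical 2x2 potential maps, for illustration only.
example_a = [[0.1, 0.9], [0.2, 0.0]]
example_b = [[0.0, 0.2], [0.8, 0.1]]
goal = select_goal(example_a, example_b, alpha=0.5)
```

The chosen cell would then be handed to whatever module answers 'how to navigate to (x, y)?'; that hand-off is outside this sketch.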

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. MVP-Nav: Multi-layer Value Map Planner Navigator

    cs.RO 2026-06 unverdicted novelty 5.0

    MVP-Nav reconstructs explicit 3D physical occupancy from monocular RGB using foundation models and integrates it with semantic priorities via a Multi-layer Value Map for grounded planning in zero-shot object navigation.