PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

Devendra Singh Chaplot; Jitendra Malik; Kristen Grauman; Santhosh Kumar Ramakrishnan; Ziad Al-Halah

arxiv: 2201.10029 · v2 · pith:34MDAPUMnew · submitted 2022-01-25 · 💻 cs.CV · cs.AI

Santhosh Kumar Ramakrishnan, Devendra Singh Chaplot, Ziad Al-Halah, Jitendra Malik, Kristen Grauman

keywords: learning, navigation, objectgoal, potential, functions, look, poni, computational

State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'. Our key insight is that 'where to look?' can be treated purely as a perception problem and learned without environment interactions. To this end, we propose a network that predicts two complementary potential functions conditioned on a semantic map and uses them to decide where to look for an unseen object. We train the potential function network using supervised learning on a passive dataset of top-down semantic maps, and integrate it into a modular framework to perform ObjectGoal navigation. Experiments on Gibson and Matterport3D demonstrate that our method achieves state-of-the-art performance for ObjectGoal navigation while incurring up to 1,600x less computational cost for training. Code and pre-trained models are available at https://vision.cs.utexas.edu/projects/poni/
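
To make the 'where to look?' step concrete, the following is a minimal sketch of how two predicted potential maps might be turned into a single look-at location. The abstract does not say how the two potentials are combined or which map cells are eligible, so the weighted sum, the `alpha` weight, the candidate mask, and every name below (`select_goal`, `potential_a`, `potential_b`, `candidate_mask`) are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[bool]]


def select_goal(
    potential_a: Grid,
    potential_b: Grid,
    candidate_mask: Mask,
    alpha: float = 0.5,
) -> Tuple[int, int]:
    """Return the (row, col) of the candidate cell with the highest combined potential.

    Assumption: the two potentials are blended with a simple weighted sum,
    alpha * potential_a + (1 - alpha) * potential_b. The abstract only states
    that two complementary potentials are predicted from a semantic map.
    """
    best_cell = None
    best_score = float("-inf")
    for r, row in enumerate(candidate_mask):
        for c, is_candidate in enumerate(row):
            if not is_candidate:
                continue  # skip cells that are not eligible look-at locations
            score = alpha * potential_a[r][c] + (1.0 - alpha) * potential_b[r][c]
            if score > best_score:
                best_score = score
                best_cell = (r, c)
    if best_cell is None:
        raise ValueError("candidate_mask contains no candidate cells")
    return best_cell


# Toy 2x3 maps; in practice these would be outputs of the trained network.
potential_a = [[0.1, 0.8, 0.0],
               [0.0, 0.2, 0.0]]
potential_b = [[0.0, 0.1, 0.0],
               [0.7, 0.3, 0.0]]
candidate_mask = [[True, True, False],
                  [True, True, False]]

goal = select_goal(potential_a, potential_b, candidate_mask, alpha=0.5)
```

The selected cell would then be handed to a separate point-goal navigation module, which matches the split between 'where to look?' and 'how to navigate to (x, y)?' described above. Changing `alpha` shifts the choice toward one potential or the other.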

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MVP-Nav: Multi-layer Value Map Planner Navigator
cs.RO · 2026-06 · unverdicted · novelty 5.0

MVP-Nav reconstructs explicit 3D physical occupancy from monocular RGB using foundation models and integrates it with semantic priorities via a Multi-layer Value Map for grounded planning in zero-shot object navigation.