Vision-language navigation: a survey and taxonomy

· 2024

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Generalizable Audio-Visual Navigation via Binaural Difference Attention and Action Transition Prediction

cs.SD · 2026-04-06 · unverdicted · novelty 7.0

BDATP improves generalization in audio-visual navigation by explicitly modeling interaural differences and adding auxiliary action prediction, achieving gains of up to 21.6 percentage points in success rate on unheard sounds on the Replica dataset.

Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

cs.CV · 2025-12-09 · unverdicted · novelty 6.0

A monocular RGB-only aerial VLN framework outperforms baselines via prompt-guided multi-task learning, keyframe selection, and label reweighting on AerialVLN and OpenFly benchmarks.

citing papers explorer

Generalizable Audio-Visual Navigation via Binaural Difference Attention and Action Transition Prediction
cs.SD · 2026-04-06 · unverdicted · none · ref 7
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
cs.CV · 2025-12-09 · unverdicted · none · ref 20
