SACF discretizes target direction and distance from audio-visual cues then applies conditioned fusion to improve navigation efficiency and generalization to unheard sounds.
Avlen: Audio-visual- language embodied navigation in 3d environments
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 2
citation-polarity summary
fields
cs.SD 2years
2026 2verdicts
UNVERDICTED 2roles
background 2polarities
background 2representative citing papers
Audio Spatially-Guided Fusion improves generalization in audio-visual navigation on unheard sound sources by extracting spatial audio features and adaptively fusing them with visual data.
citing papers explorer
-
Spatial-Aware Conditioned Fusion for Audio-Visual Navigation
SACF discretizes target direction and distance from audio-visual cues then applies conditioned fusion to improve navigation efficiency and generalization to unheard sounds.
-
Audio Spatially-Guided Fusion for Audio-Visual Navigation
Audio Spatially-Guided Fusion improves generalization in audio-visual navigation on unheard sound sources by extracting spatial audio features and adaptively fusing them with visual data.