The paper introduces a Trajectory Waypoint paradigm with a TSDF-guided diffusion policy and trajectory-enhanced navigator that achieves better performance on VLN-CE benchmarks by ensuring waypoint reachability and planning-execution consistency.
MapGPT: Map-Guided Prompting for Unified Vision-and-Language Navigation
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts: UNVERDICTED 7
roles: background 1
polarities: background 1
representative citing papers
AeroBridge-TTA achieves +22 pt average gains on out-of-distribution UAV dynamics mismatches by updating a latent state online from observed transitions in a language-conditioned policy.
DynFly bridges high-level UAV navigation reasoning to continuous motion via B-spline trajectory generation with flow matching and UAV-specific dynamic supervision, yielding metric gains on the OpenUAV benchmark.
Experiments reveal that topological cues robustly support LLM navigation planning while incorrect semantic cues derail it, with linguistic format effects varying by model size and compression.
NaVid, a video-based VLM trained on 510k navigation and 763k web samples, achieves SOTA VLN performance using only monocular RGB video for next-step action planning in sim and real environments.
RePlan-Bot achieves state-of-the-art results on the ALFRED benchmark for embodied instruction following by integrating LLM-based auditing, commonsense map search, and ViT action correction.
Proposes cost-aware question selection for ambiguous object navigation via information-gain analysis on corpora, a cost-penalizing benchmark, and a zero-shot MLLM agent.
citing papers explorer
- The Sword, Shield, and Achilles' Heel: Characterizing the Linguistic Inductive Bias of Large Language Models for Spatial Reasoning in Navigation Planning
Experiments reveal that topological cues robustly support LLM navigation planning while incorrect semantic cues derail it, with linguistic format effects varying by model size and compression.