Title resolution pending
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: physics.soc-ph (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Lost in Aggregation: A Multi-Scale Diagnostic Benchmark for LLM Spatial Navigation
A new diagnostic benchmark decomposes LLM spatial navigation into three cognitive scales and shows that cross-scale aggregation, not single-level deficits, causes failure beyond small mazes.