MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios?
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Representative citing papers:
SpatialUAV is a new real-world benchmark dataset and evaluation suite exposing large gaps between vision-language models and human performance on spatial tasks for low-altitude UAVs.
Citing papers:
- AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration
  AirGroundBench is a new diagnostic benchmark exposing that MLLMs handle basic spatial perception but struggle with cross-view alignment, transformation reasoning, and embodied navigation under heterogeneous air-ground views.
- SpatialUAV: Benchmarking Spatial Intelligence for Low-Altitude UAV Perception, Collaboration, and Motion
  SpatialUAV is a new real-world benchmark dataset and evaluation suite exposing large gaps between vision-language models and human performance on spatial tasks for low-altitude UAVs.