Vision-language models in remote sensing: Current progress and future trends. IEEE Geoscience and Remote Sensing Magazine, 12(2):32–66
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
fields: cs.CV 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
citing papers explorer
- SenseBench: A Benchmark for Remote Sensing Low-Level Visual Perception and Description in Large Vision-Language Models
  SenseBench is the first physics-based benchmark with 10K+ instances and dual protocols to evaluate VLMs on remote sensing low-level perception and diagnostic description, revealing domain bias and specific failure modes.
- SkyNative: A Native Multimodal Framework for Remote Sensing Visual Evidence Reasoning
  SkyNative introduces an encoder-free architecture using raw patch tokens and modality-specific parameters in a unified autoregressive model to improve image-grounded reasoning in remote sensing vision-language tasks.