VLPRSDet: A vision-language pretrained model for remote sensing object detection
1 Pith paper cites this work (polarity classification is still indexing).
Fields: eess.IV · Years: 2026 · Verdicts: UNVERDICTED
Representative citing paper:
Geospatial-Temporal Sensemaking of Remote Sensing Activity Detections with Multimodal Large Language Model
Introduces the SMART-HC-VQA dataset, comprising 65k single-image and 2.3M temporal VQA examples, together with an adapted LLaVA-NeXT multimodal large language model (MLLM) framework for geospatial-temporal sensemaking of remote sensing construction-activity detections.