ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
Lightweight multimodal projector alignment transfers RGB VLMs to thermal drone imagery, achieving F1 scores of 0.915-0.968 for deer, rhino, and elephant recognition plus high enumeration accuracy and habitat context interpretation on a real drone dataset.
citing papers explorer
- SemDINO: A DINOv3-Driven Network for Cross-Temporal Semantic Alignment in Change Detection
  SemDINO proposes a dual-branch encoder with DINOv3 features, multi-scale temporal interaction, and enhancement modules for improved semantic change detection in remote sensing.
- Lightweight Multimodal Adaptation of Vision Language Models for Species Recognition and Habitat Context Interpretation in Drone Thermal Imagery
  Lightweight multimodal projector alignment transfers RGB VLMs to thermal drone imagery, achieving F1 scores of 0.915-0.968 for deer, rhino, and elephant recognition plus high enumeration accuracy and habitat context interpretation on a real drone dataset.