pith. machine review for the scientific record.

arxiv: 2511.23332 · v3 · submitted 2025-11-28 · 💻 cs.CV

Recognition: unknown

UniGeoSeg: Towards Unified Open-World Segmentation for Geospatial Scenes

Shuo Ni, Di Wang, He Chen, Haonan Guo, Ning Zhang, Jing Zhang

keywords segmentation · unigeoseg · instruction-driven · across · datasets · diverse · generalization · geoseg-1m

Instruction-driven segmentation in remote sensing generates masks from guidance, offering great potential for accessible and generalizable applications. However, existing methods suffer from fragmented task formulations and limited instruction data, hindering effective understanding and generalization. To address these issues, we introduce GeoSeg-1M, the first million-scale dataset for remote sensing instruction-driven segmentation, constructed via an automatic mask filtering and instruction generation pipeline that synthesizes referring, interactive, and reasoning segmentation instructions from multiple public datasets. GeoSeg-1M contains 590K images, 117 categories, and 1.1M image-mask-instruction triplets. Building upon this foundation, we further curate GeoSeg-Bench, a challenging benchmark designed to evaluate contextual understanding and reasoning capabilities across diverse instruction-driven tasks and complex geospatial scenes. Furthermore, we present UniGeoSeg, a unified framework that serves as a strong baseline, incorporating task-aware text enhancement, latent knowledge memory, and a progressive training strategy to facilitate multi-task learning. Extensive experiments demonstrate the state-of-the-art performance of UniGeoSeg across GeoSeg-Bench and diverse public benchmarks, while exhibiting strong zero-shot generalization. Datasets and source code were released at https://github.com/MiliLab/UniGeoSeg.

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

    cs.CV 2026-05 unverdicted novelty 7.0

    VLMs show a resolution illusion on UHR Earth observation imagery where higher resolution does not improve micro-target perception; UHR-Micro benchmark and MAP-Agent address this via evidence-centered active inspection.

  2. Think in Latent Thoughts: A New Paradigm for Gloss-Free Sign Language Translation

    cs.CV 2026-04 unverdicted novelty 6.0

    A new SLT framework uses latent thoughts as a middle reasoning layer and plan-then-ground decoding to improve coherence and faithfulness in gloss-free sign language translation.