SimMIM: A Simple Framework for Masked Image Modeling
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Context-Aware Semantic Segmentation via Stage-Wise Attention
CASWiT reaches 66.37% mIoU on FLAIR-HUB and 49.2% on URUR by injecting low-resolution context via stage-wise cross-attention in a dual-branch Swin architecture with SimMIM-style pretraining.