Pyramid Attention Network for Semantic Segmentation
A Pyramid Attention Network (PAN) is proposed to exploit the impact of global contextual information in semantic segmentation. Unlike most existing works, we combine an attention mechanism with a spatial pyramid to extract precise dense features for pixel labeling, instead of relying on complicated dilated convolutions and artificially designed decoder networks. Specifically, we introduce a Feature Pyramid Attention module, which applies a spatial pyramid attention structure to the high-level output and combines it with global pooling to learn a better feature representation, and a Global Attention Upsample module on each decoder layer, which provides global context as guidance for low-level features to select category localization details. The proposed approach achieves state-of-the-art performance on the PASCAL VOC 2012 and Cityscapes benchmarks, with a new record mIoU of 84.0% on PASCAL VOC 2012, while training without the COCO dataset.
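The idea behind the Global Attention Upsample module, global context from high-level features guiding low-level features, can be illustrated with a minimal NumPy sketch. The abstract does not give the layer details, so everything concrete here is an assumption made for illustration: the function name `global_attention_upsample`, the `w_gate` matrix standing in for a learned 1x1 convolution, the sigmoid gate, the nearest-neighbour upsampling, and the requirement that both inputs share the same channel count.

```python
import numpy as np

def global_attention_upsample(low, high, w_gate):
    """Illustrative channel gating of low-level features by global context.

    low:    (C, H, W) low-level feature map
    high:   (C, h, w) high-level feature map, with H and W integer multiples of h and w
    w_gate: (C, C) hypothetical weights standing in for a learned 1x1 convolution
    """
    _, H, W = low.shape
    _, h, w = high.shape
    # Global average pooling: one context value per high-level channel.
    context = high.mean(axis=(1, 2))
    # Assumed gate: linear map followed by a sigmoid, giving weights in (0, 1).
    gates = 1.0 / (1.0 + np.exp(-(w_gate @ context)))
    # Re-weight each low-level channel by its gate.
    weighted_low = low * gates[:, None, None]
    # Nearest-neighbour upsampling of the high-level map to the low-level resolution.
    up = np.repeat(np.repeat(high, H // h, axis=1), W // w, axis=2)
    # Fuse the gated low-level details with the upsampled high-level features.
    return weighted_low + up

low = np.ones((2, 4, 4))
high = np.zeros((2, 2, 2))
out = global_attention_upsample(low, high, np.zeros((2, 2)))
print(out.shape)
```

In this toy call the high-level map is all zeros, so every gate is 0.5 and the output is the low-level map scaled by one half. A trained model would instead learn gates that emphasise the channels relevant to the categories present in the image.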
Forward citations
Cited by 2 Pith papers
- Quantum Feature Pyramid Gating for Seismic Image Segmentation
A 4-qubit quantum feature pyramid gating architecture raises mean IoU from 0.8404 to 0.9389 over classical addition in controlled ablations on the TGS salt segmentation dataset.
- Grounding Synthetic Data Generation With Vision and Language Models
A vision-language grounded framework generates and evaluates synthetic remote sensing data, releasing ARAS400k where augmented training outperforms real-data baselines for segmentation and captioning.