pith. machine review for the scientific record.

arxiv: 1612.01105 · v2 · submitted 2016-12-04 · 💻 cs.CV

Recognition: unknown

Pyramid Scene Parsing Network

classification 💻 cs.CV
keywords: parsing, scene, pspnet, pyramid, accuracy, benchmark, cityscapes, context

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit global context information through different-region-based context aggregation, using our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective in producing good-quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction tasks. The proposed approach achieves state-of-the-art performance on various datasets. It came first in the ImageNet scene parsing challenge 2016, the PASCAL VOC 2012 benchmark, and the Cityscapes benchmark. A single PSPNet yields a new record of 85.4% mIoU accuracy on PASCAL VOC 2012 and 80.2% accuracy on Cityscapes.
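
The "different-region-based context aggregation" named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no implementation details, so everything specific here is an assumption made for illustration: the bin sizes (1, 2, 3, 6), nearest-neighbour upsampling, the absence of any learned convolution, and the hypothetical function names `adaptive_avg_pool`, `upsample_nearest`, and `pyramid_pool`. It shows the general idea of pooling a feature map over several region grids and concatenating the results, not the authors' implementation.

```python
import numpy as np


def adaptive_avg_pool(feat, bins):
    """Average a (C, H, W) feature map over a bins x bins grid of regions."""
    c, h, w = feat.shape
    out = np.empty((c, bins, bins), dtype=float)
    for i in range(bins):
        # Region bounds: floor for the start, ceil for the end.
        r0 = (i * h) // bins
        r1 = -((-(i + 1) * h) // bins)
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = -((-(j + 1) * w) // bins)
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out


def upsample_nearest(pooled, h, w):
    """Resize a (C, bins, bins) map back to (C, h, w) by nearest neighbour.

    Assumption: nearest neighbour is used only to keep the sketch short.
    """
    bins = pooled.shape[1]
    rows = (np.arange(h) * bins) // h
    cols = (np.arange(w) * bins) // w
    return pooled[:, rows[:, None], cols[None, :]]


def pyramid_pool(feat, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input with region-pooled context at several scales.

    Assumption: bin_sizes is an illustrative choice, not taken from the abstract.
    """
    _, h, w = feat.shape
    levels = [upsample_nearest(adaptive_avg_pool(feat, b), h, w) for b in bin_sizes]
    return np.concatenate([feat] + levels, axis=0)


if __name__ == "__main__":
    features = np.arange(2 * 6 * 6, dtype=float).reshape(2, 6, 6)
    print(pyramid_pool(features).shape)
```

With a (2, 6, 6) input and four pyramid levels, the result has shape (10, 6, 6): the two original channels followed by two pooled channels per level. The coarsest level (one bin) carries the global average of each channel at every pixel, which is one simple way to expose global context to a per-pixel classifier.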

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. InCoM: Intent-Driven Perception and Structured Coordination for Mobile Manipulation

    cs.RO 2026-02 unverdicted novelty 6.0

    InCoM achieves 23-28% higher success rates in mobile manipulation tasks by inferring motion intent for adaptive perception and decoupling base-arm action generation.

  2. Rethinking Atrous Convolution for Semantic Image Segmentation

    cs.CV 2017-06 unverdicted novelty 6.0

    DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF p...