
arxiv: 1704.08545 · v2 · pith:AX2K4ZNUnew · submitted 2017-04-27 · 💻 cs.CV

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

classification 💻 cs.CV
keywords real-time, segmentation, cascade, challenging, icnet, inference, label, semantic

We focus on the challenging task of real-time semantic segmentation in this paper. The task has many practical applications, yet it poses the fundamental difficulty of substantially reducing the computation needed for pixel-wise label inference. We propose an image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address this challenge. We provide an in-depth analysis of our framework and introduce the cascade feature fusion unit to quickly achieve high-quality segmentation. Our system yields real-time inference on a single GPU card with decent-quality results on challenging datasets such as Cityscapes, CamVid and COCO-Stuff.
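
The coarse-to-fine data flow that the abstract describes (multi-resolution branches merged by a fusion step) can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`downsample`, `upsample2x`, `fuse`, `cascade`), the 1/4, 1/2 and full-resolution scales, and the "upsample, add, ReLU" fusion rule are all assumptions chosen for illustration. A real network would run convolutional layers in each branch and train with label supervision, both of which are omitted here.

```python
from typing import List

Grid = List[List[float]]  # a single-channel 2D map, stored row by row


def downsample(x: Grid, factor: int) -> Grid:
    """Average-pool a 2D grid by an integer factor (sizes must divide evenly)."""
    h, w = len(x), len(x[0])
    assert h % factor == 0 and w % factor == 0
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [x[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample2x(x: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling: repeat every value in both directions."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(coarse: Grid, fine: Grid) -> Grid:
    """Hypothetical stand-in for a fusion unit: upsample, add, then ReLU."""
    up = upsample2x(coarse)
    assert len(up) == len(fine) and len(up[0]) == len(fine[0])
    return [[max(0.0, a + b) for a, b in zip(ru, rf)]
            for ru, rf in zip(up, fine)]


def cascade(image: Grid) -> Grid:
    """Merge three resolutions from coarse to fine (assumed scales 1/4, 1/2, 1)."""
    low = downsample(image, 4)   # cheapest branch, coarsest map
    mid = downsample(image, 2)   # intermediate branch
    fused_mid = fuse(low, mid)   # coarse result refines the mid branch
    return fuse(fused_mid, image)


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    for row in cascade(image):
        print(row)
```

The point of the sketch is only the ordering: the coarsest map is computed first and each finer branch adds detail on top of the upsampled result, which is one plausible reading of "cascade" in the abstract.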



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Separable Convolutional LSTMs for Faster Video Segmentation

    cs.CV 2019-07 unverdicted novelty 6.0

    Separable convLSTMs cut parameters and FLOPs in video segmentation, delivering up to 15% faster GPU inference with similar or slightly lower accuracy.

  2. Efficient Segmentation: Learning Downsampling Near Semantic Boundaries

    cs.CV 2019-07 unverdicted novelty 4.0

    Learned adaptive downsampling for semantic segmentation that prioritizes locations near semantic boundaries to improve the accuracy-efficiency trade-off over uniform sampling.

  3. ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation

    cs.CV 2019-06 unverdicted novelty 4.0

    ESNet is a lightweight symmetric CNN using factorized residual units and parallel dilated convolutions that reaches over 62 FPS semantic segmentation on Cityscapes with 1.6M parameters.