ICNet for Real-Time Semantic Segmentation on High-Resolution Images
We focus on the challenging task of real-time semantic segmentation in this paper. It has many practical applications, yet it faces the fundamental difficulty of reducing a large portion of the computation required for pixel-wise label inference. We propose an image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address this challenge. We provide an in-depth analysis of our framework and introduce the cascade feature fusion unit to quickly achieve high-quality segmentation. Our system yields real-time inference on a single GPU card with decent-quality results on challenging datasets such as Cityscapes, CamVid and COCO-Stuff.
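The cost of pixel-wise label inference grows with the number of pixels processed, which is why routing most computation through lower-resolution inputs can help. The following is a minimal sketch of that arithmetic only, not the authors' implementation. The scale factors (1, 1/2, 1/4), the 1024×2048 input size, and the function names are assumptions chosen for illustration; the abstract does not specify them.

```python
# Hypothetical illustration: pixel counts at the resolutions a
# multi-resolution cascade might process. The scale factors and the
# input size are assumptions for this sketch, not values from the abstract.

def branch_pixel_counts(height, width, scales=(1.0, 0.5, 0.25)):
    """Return {scale: pixel count} for an image resized by each scale."""
    counts = {}
    for s in scales:
        h, w = int(height * s), int(width * s)
        counts[s] = h * w
    return counts


def relative_cost(counts):
    """Pixel count of each branch relative to the largest-scale branch."""
    full = counts[max(counts)]
    return {s: n / full for s, n in counts.items()}


counts = branch_pixel_counts(1024, 2048)
ratios = relative_cost(counts)
for s in sorted(counts, reverse=True):
    print(f"scale {s}: {counts[s]} pixels ({ratios[s]:.4f} of full)")
```

Under these assumptions, the quarter-scale input has 1/16 of the pixels of the full-resolution input, so a deep network applied only at that scale touches far fewer positions per layer.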
Forward citations
Cited by 3 Pith papers
- Separable Convolutional LSTMs for Faster Video Segmentation
  Separable convLSTMs cut parameters and FLOPs in video segmentation, delivering up to 15% faster GPU inference with similar or slightly lower accuracy.
- Efficient Segmentation: Learning Downsampling Near Semantic Boundaries
  Learned adaptive downsampling for semantic segmentation that prioritizes locations near semantic boundaries to improve the accuracy-efficiency trade-off over uniform sampling.
- ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation
  ESNet is a lightweight symmetric CNN using factorized residual units and parallel dilated convolutions that reaches over 62 FPS semantic segmentation on Cityscapes with 1.6M parameters.