ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

Adam Paszke , Abhishek Chaurasia , Sangpil Kim , Eugenio Culurciello

classification cs.CV

keywords enet, network, neural, architecture, deep, times, accuracy, existing

The ability to perform pixel-wise semantic segmentation in real time is of paramount importance in mobile applications. Recent deep neural networks aimed at this task have the disadvantage of requiring a large number of floating point operations and have long run-times that hinder their usability. In this paper, we propose a novel deep neural network architecture named ENet (efficient neural network), created specifically for tasks requiring low-latency operation. ENet is up to 18$\times$ faster, requires 75$\times$ fewer FLOPs, has 79$\times$ fewer parameters, and provides accuracy similar to or better than that of existing models. We have tested it on the CamVid, Cityscapes and SUN datasets and report on comparisons with existing state-of-the-art methods, and on the trade-offs between a network's accuracy and its processing time. We present performance measurements of the proposed architecture on embedded systems and suggest possible software improvements that could make ENet even faster.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Bidirectional Cross-Attention Fusion of High-Res RGB and Low-Res HSI for Multimodal Automated Waste Sorting
cs.CV 2026-03 accept novelty 7.0

BCAF fuses native-grid high-res RGB and low-res HSI via bidirectional cross-attention in adapted Swin Transformers to reach state-of-the-art mIoU on SpectralWaste and a new industrial dataset while running at real-tim...

Bidirectional Cross-Attention Fusion of High-Res RGB and Low-Res HSI for Multimodal Automated Waste Sorting
cs.CV 2026-04 unverdicted novelty 6.0

Patch-wise sparse MoE layers in CNNs for semantic segmentation yield architecture-dependent gains up to 3.9 mIoU on Cityscapes and BDD100K with low overhead, but show strong design sensitivity.

Geometric Flood Depth Estimation: Fusing Transformer-Based Segmentation with Digital Elevation Models
cs.CV 2026-05 unverdicted novelty 5.0

A pipeline uses Mask2Former flood masks and DEMs to compute a single water surface elevation, then derives local depths under hydrostatic equilibrium.

DA-SegFormer: Damage-Aware Semantic Segmentation for Fine-Grained Disaster Assessment
cs.CV 2026-05 unverdicted novelty 4.0

DA-SegFormer reaches 74.61% mIoU on RescueNet by adding class-aware sampling and OHEM-Dice loss to SegFormer, delivering double-digit gains on minor and major damage classes.

FoR-Net: Learning to Focus on Hard Regions for Efficient Semantic Segmentation
cs.CV 2026-05 unverdicted novelty 4.0

FoR-Net improves efficiency in semantic segmentation by focusing on hard regions with a learned selector and multi-scale convolutions, achieving competitive results on Cityscapes.

From Virtual Environments to Real-World Trials: Emerging Trends in Autonomous Driving
cs.AI 2026-03 unverdicted novelty 4.0

A survey organizes synthetic data use, digital twin simulation, and domain adaptation techniques for autonomous driving while identifying open challenges like Sim2Real transfer.