TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

2018 · cs.CV · arXiv 1801.05746

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Pixel-wise image segmentation is a demanding task in computer vision. Classical U-Net architectures, composed of encoders and decoders, are very popular for segmenting medical images, satellite images, and similar data. Typically, a neural network initialized with weights from a network pre-trained on a large dataset such as ImageNet performs better than one trained from scratch on a small dataset. In some practical applications, particularly in medicine and traffic safety, the accuracy of the models is of utmost importance. In this paper, we demonstrate how a U-Net type architecture can be improved by using a pre-trained encoder. Our code and corresponding pre-trained weights are publicly available at https://github.com/ternaus/TernausNet. We compare three weight initialization schemes: LeCun uniform, the encoder with weights from VGG11, and the full network trained on the Carvana dataset. This network architecture was part of the winning solution (1st out of 735) in the Kaggle Carvana Image Masking Challenge.
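The abstract names LeCun uniform as the from-scratch baseline among the three initialization schemes. As a minimal sketch of what that scheme does (not the authors' code), the snippet below draws weights from a uniform distribution on [-limit, limit] with limit = sqrt(3 / fan_in). The function name `lecun_uniform` and the example layer size (a 3x3 convolution over 3 input channels with 64 filters, so fan_in = 27) are assumptions chosen for illustration.

```python
import math
import random


def lecun_uniform(fan_in, fan_out, rng=None):
    """Return a fan_out x fan_in weight matrix drawn from U(-limit, limit).

    limit = sqrt(3 / fan_in), which gives each weight variance 1 / fan_in.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    limit = math.sqrt(3.0 / fan_in)
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


# Assumed example layer: 3x3 kernel, 3 input channels, 64 output filters.
fan_in = 3 * 3 * 3
weights = lecun_uniform(fan_in=fan_in, fan_out=64, rng=random.Random(0))
limit = math.sqrt(3.0 / fan_in)
print(len(weights), len(weights[0]), round(limit, 4))
```

In the pre-trained schemes, encoder weights would instead be loaded from a trained VGG11 (or from the full network trained on Carvana), and only the remaining layers would need a random initializer like this one.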

representative citing papers

InterCMDM: Block-Causal Diffusion for Autoregressive Human Interaction Generation

cs.CV · 2026-07-02 · unverdicted · novelty 5.0

InterCMDM proposes a block-causal latent diffusion framework with dual-stream causal transformers and multi-task attention masks for autoregressive text-conditioned two-person interaction generation and reports SOTA results on InterHuman and Inter-X.

Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography

eess.IV · 2019-07-11 · accept · novelty 5.0

U-Net with combined binary cross-entropy and soft Jaccard loss segments the tidemark in PTA-stained micro-CT images, achieving IoU scores of 0.59–0.86 at increasing padded distances on cross-validation of 35 samples.

Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation

cs.LG · 2019-07-05 · unverdicted · novelty 5.0

Generates large labeled realistic laparoscopic image datasets from simulations using extended unpaired translation and demonstrates use for liver segmentation achieving Dice scores up to 0.89 without any real labeled data.
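The summaries above report segmentation quality as IoU (Jaccard) and Dice scores. For readers comparing the two, here is a minimal sketch of how both are commonly computed on binary masks. The function name `iou_and_dice`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the cited papers.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two binary masks given as 2D lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    union = total - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    return intersection / union, 2.0 * intersection / total


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [1, 1, 0]]
iou, dice = iou_and_dice(pred, target)
print(iou, dice)
```

For the same pair of masks the two metrics are related by Dice = 2 * IoU / (1 + IoU), so a Dice score is never lower than the corresponding IoU.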

citing papers explorer

InterCMDM: Block-Causal Diffusion for Autoregressive Human Interaction Generation

cs.CV · 2026-07-02 · unverdicted · none · ref 51 · internal anchor

Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography

eess.IV · 2019-07-11 · accept · none · ref 5 · internal anchor

Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation

cs.LG · 2019-07-05 · unverdicted · none · ref 7 · internal anchor