U-net snippet

We recommend using the U-net snippet for image segmentation and reconstruction.

The U-net was introduced as a model for segmenting biomedical images. It uses a convolutional autoencoder structure with skip connections that combine low-level and high-level information.

Figure 1. Raw image and U-net generated segmentation mask (illustration from original paper)

U-net architecture

The U-net has a horseshoe shape with a downsampling path and an upsampling path. Skip connections feed information from the downsampling path to the upsampling path.

The purpose of the downsampling path is to capture the context of the input image in order to be able to do segmentation. This coarse contextual information is transferred to the upsampling path by means of skip connections.

The purpose of the upsampling path is to enable precise localization combined with contextual information from the downsampling path.

Figure 2. U-net architecture. Each grey box corresponds to the resulting activation map from a 2D Convolution. Each red box corresponds to the resulting activation map from a 2D Max pooling block. White boxes represent activation maps forwarded from the downsampling path that are concatenated with the corresponding activation map (blue) in the upsampling path.

The downsampling path follows the typical architecture of a convolutional network. It consists of the repeated application of:

  • Two 2D Convolution blocks with ReLU activation

  • A 2D Max pooling with stride 2 for downsampling. At each downsampling step, we double the number of filters.
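The effect of the downsampling steps on the activation map can be traced with a few lines of Python. This is a minimal sketch, not the snippet's actual configuration: the function name, the 64 starting filters and the 4 downsampling steps are illustrative assumptions, and the convolutions are assumed to be padded so that only the pooling changes the spatial size.

```python
def downsampling_shapes(size, base_filters=64, steps=4):
    """Return (spatial_size, filters) for each level of the downsampling path.

    Assumption: padded convolutions, so only max pooling changes the size.
    base_filters and steps are illustrative defaults, not platform settings.
    """
    shapes = []
    filters = base_filters
    for _ in range(steps):
        shapes.append((size, filters))  # output of the two 2D Convolution blocks
        size //= 2                      # 2D Max pooling with stride 2 halves each side
        filters *= 2                    # the number of filters doubles at each step
    shapes.append((size, filters))      # bottom of the horseshoe
    return shapes


for side, filters in downsampling_shapes(256):
    print(f"{side}x{side} activation map with {filters} filters")
```

Under these assumptions, a 256x256 input is reduced to a 16x16 activation map at the bottom of the horseshoe.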

Every step in the upsampling path consists of:

  • A 2D upsampling of the activation map

  • A 2D Convolution block that halves the number of filters

  • A Concatenate block that concatenates with the corresponding activation map from the downsampling path

  • Two 2D Convolutions with ReLU activation. In the original paper, the forwarded activation map is cropped before concatenation, because its unpadded convolutions lose border pixels.

At the final block, a 1x1 convolution is used to map each feature vector to the desired number of classes.
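The upsampling steps and the final 1x1 convolution can be sketched in the same way. The function names, the starting values (a 16x16 map with 1024 filters) and the 4 steps are assumptions made for illustration. The sketch also assumes padded convolutions, so the forwarded activation map already has the right size and no cropping is needed.

```python
def upsampling_shapes(size, filters, steps=4):
    """Return (spatial_size, channels_after_concat, filters_after_convs) per step."""
    trace = []
    for _ in range(steps):
        size *= 2                    # 2D upsampling doubles each side
        filters //= 2                # the 2D Convolution block halves the filters
        concatenated = 2 * filters   # the forwarded map adds the same number of channels
        trace.append((size, concatenated, filters))
    return trace


def conv1x1(feature_vector, weights, biases):
    """Apply a 1x1 convolution at one pixel: one weighted sum per class."""
    return [
        sum(w * x for w, x in zip(row, feature_vector)) + b
        for row, b in zip(weights, biases)
    ]


for side, concat, filters in upsampling_shapes(16, 1024):
    print(f"{side}x{side}: {concat} channels after concatenation, {filters} after convolutions")

# A 2-channel feature vector mapped to 2 class scores (hypothetical weights).
scores = conv1x1([1.0, 2.0], [[1.0, 0.0], [0.5, 0.5]], [0.0, 1.0])
print(scores)
```

Because a 1x1 convolution looks at a single pixel at a time, it changes only the number of channels and leaves the spatial size of the activation map untouched.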

How to use the U-net snippet

To add a U-net snippet, open the Snippet section in the Inspector and click U-net.

The size of the images in the dataset should be 256x256 pixels or larger.
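A quick check of image sizes before uploading a dataset might look like the sketch below. The 256-pixel minimum comes from the text above. The divisibility check is an added assumption, not a documented platform requirement: with a hypothetical 4 pooling steps, sides that are multiples of 16 keep the upsampled maps the same size as the forwarded ones.

```python
def check_input_size(height, width, min_side=256, pooling_steps=4):
    """Return a list of problems found with an image size (empty if none).

    pooling_steps is an illustrative assumption; the divisibility rule is a
    general rule of thumb, not a stated requirement of the snippet.
    """
    factor = 2 ** pooling_steps
    problems = []
    for name, side in (("height", height), ("width", width)):
        if side < min_side:
            problems.append(f"{name} {side} is smaller than {min_side}")
        if side % factor != 0:
            problems.append(f"{name} {side} is not divisible by {factor}")
    return problems


print(check_input_size(256, 256))
print(check_input_size(200, 260))
```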

Group of blocks

The blocks included in the U-net are grouped in functional blocks. You can expand each group to inspect which blocks are included. This makes the architecture easier to understand, modify and manipulate.

You can still change the parameters of each individual block.

Name of the group of blocks

  1. Residual Branch - Identity

  2. Residual Group - Identity

  3. Residual Group - Projection

U-net for image segmentation

Multi-class segmentation

If your target consists of more than 2 classes, set the activation in the last 2D Convolution block to Softmax and set the loss function in the Target block to Categorical crossentropy.
Example: Satellite image segmentation where each pixel is labeled with a class. In this image, pixels are labeled as built-up (red), farmland (green) or unknown (grey).

Multi-class segmentation on a satellite image
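To illustrate what the Softmax activation and the Categorical crossentropy loss compute for a single pixel, here is a minimal sketch in plain Python. The function names and the example scores are hypothetical. On the platform these are selected as block settings rather than written by hand.

```python
import math


def softmax(logits):
    """Turn per-class scores for one pixel into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities, eps=1e-7):
    """Negative log-probability assigned to the true class of one pixel."""
    return -sum(
        t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probabilities)
    )


# Hypothetical scores for one pixel: built-up, farmland, unknown.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss = categorical_crossentropy([1, 0, 0], probs)  # true class: built-up
print(probs, loss)
```

The loss for a whole image is typically the average of this per-pixel value over all pixels.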

Binary class segmentation

If your target consists of 2 classes encoded as an image of 0s and 1s, set the activation in the last 2D Convolution block to Sigmoid and set the loss function in the Target block to Binary crossentropy.
Example: The target is a black (0) and white (1) image mask. This image shows HeLa cells on glass recorded with microscopy, together with the generated segmentation mask (image from the original paper).

U-net generated binary segmentation mask
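The binary case can be sketched the same way. The Sigmoid activation maps one score per pixel to a probability of the pixel being white (1), and Binary crossentropy penalizes that probability against the 0 or 1 in the mask. The function names and the example values below are hypothetical.

```python
import math


def sigmoid(z):
    """Map one per-pixel score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(target, probability, eps=1e-7):
    """Loss for one pixel with target 0 or 1."""
    p = min(max(probability, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


# Hypothetical scores for a 4-pixel mask and its black (0) / white (1) targets.
scores = [3.0, -2.0, 0.0, 1.5]
targets = [1, 0, 1, 1]
losses = [binary_crossentropy(t, sigmoid(s)) for t, s in zip(targets, scores)]
mean_loss = sum(losses) / len(losses)
print(mean_loss)
```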

U-net for image reconstruction

If you use U-net for image reconstruction, choose a loss function that matches how you normalize your targets.
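As one possible pairing, and purely as an assumption for illustration, targets stored as 8-bit pixel values could be scaled to the range 0 to 1 and compared with the predictions using mean squared error. The function names below are hypothetical, and other normalizations may call for a different loss.

```python
def scale_to_unit(pixels, max_value=255.0):
    """Scale raw pixel values to the range 0 to 1 (assumes 8-bit images)."""
    return [p / max_value for p in pixels]


def mean_squared_error(targets, predictions):
    """Average squared difference between target and predicted pixel values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


targets = scale_to_unit([0, 255])
error = mean_squared_error(targets, [0.0, 0.5])
print(targets, error)
```

The point of the sketch is that the loss is computed on the normalized scale, so the predictions of the last block need to lie on that same scale.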

Reference

Olaf Ronneberger, Philipp Fischer, Thomas Brox: U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015.