We recommend using the U-net snippet for image segmentation and reconstruction.
The U-net was introduced as a model for segmenting biomedical images. It uses a convolutional autoencoder structure with skip connections that combine low-level and high-level information.
The U-net has a horseshoe shape with a downsampling path and an upsampling path. Skip connections feed information from the downsampling path to the upsampling path.
The purpose of the downsampling path is to capture the context of the input image in order to be able to do segmentation. This coarse contextual information is transferred to the upsampling path by means of skip connections.
The purpose of the upsampling path is to enable precise localization combined with contextual information from the downsampling path.
The downsampling path follows the typical architecture of a convolutional network. It consists of the repeated application of:
Two 2D Convolution blocks with ReLU activation
A 2D Max pooling with stride 2 for downsampling. At each downsampling step, we double the number of filters.
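As a rough illustration of the downsampling path described above, the following sketch traces how the activation map size and the number of filters change from level to level. It only does shape arithmetic. The starting value of 64 filters, the four pooling steps, and the padded convolutions (which keep height and width unchanged) are assumptions made for this example, and `downsampling_shapes` is a hypothetical helper name.

```python
def downsampling_shapes(height, width, base_filters=64, steps=4):
    """Return (height, width, filters) after the convolutions at each level.

    Assumptions for illustration: padded convolutions (spatial size is
    unchanged), 64 starting filters, and 4 pooling steps.
    """
    shapes = []
    filters = base_filters
    for _ in range(steps):
        # Two 2D Convolution blocks with ReLU: same size, `filters` channels.
        shapes.append((height, width, filters))
        # 2D Max pooling with stride 2 halves height and width.
        height, width = height // 2, width // 2
        # The number of filters doubles at each downsampling step.
        filters *= 2
    # Bottom of the horseshoe: two more convolutions, no further pooling.
    shapes.append((height, width, filters))
    return shapes


for level, shape in enumerate(downsampling_shapes(64, 128)):
    print(level, shape)
```

Under these assumptions, a 64x128 input shrinks to 4x8 at the bottom of the horseshoe while the filter count grows by a factor of 16.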
Every step in the upsampling path consists of:
A 2D upsampling of the activation map
A 2D Convolution block that halves the number of filters
A Concatenate block that concatenates with the corresponding activation map from the downsampling path
Two 2D Convolutions with ReLU activation. In the original paper, the activation map from the downsampling path is cropped before concatenation, because unpadded convolutions lose border pixels.
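The upsampling step listed above can be sketched with the same kind of shape arithmetic. This example assumes padded convolutions, so the two activation maps already have the same height and width and no cropping is needed. It also assumes that the two final convolutions keep the halved filter count. `upsampling_step` is a hypothetical helper name.

```python
def upsampling_step(shape, skip_shape):
    """Trace one upsampling step on (height, width, filters) tuples.

    Returns the shape right after concatenation and the shape after the
    two final convolutions. Assumes padded convolutions (no cropping).
    """
    height, width, filters = shape
    # 2D upsampling doubles height and width.
    height, width = height * 2, width * 2
    # A 2D Convolution block halves the number of filters.
    filters //= 2
    # Concatenation requires matching spatial sizes.
    if (height, width) != skip_shape[:2]:
        raise ValueError("spatial sizes differ; cannot concatenate")
    concatenated = (height, width, filters + skip_shape[2])
    # Two 2D Convolutions with ReLU; assumed to output `filters` channels.
    return concatenated, (height, width, filters)


concatenated, output = upsampling_step((4, 8, 1024), (8, 16, 512))
print(concatenated, output)
```

The concatenated map temporarily carries the channels of both paths, which is how the upsampling path gets access to the finer detail from the downsampling path.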
At the final block, a 1x1 convolution is used to map each feature vector to the desired number of classes.
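A 1x1 convolution is the same linear map applied independently to the feature vector of every pixel. The sketch below shows this on plain nested lists. The function name `conv1x1` and the `[height][width][channels]` layout are assumptions chosen for this example, not something the snippet exposes.

```python
def conv1x1(feature_map, weights, biases):
    """Apply a 1x1 convolution to a feature map stored as nested lists.

    feature_map: [height][width][in_channels]
    weights:     [num_classes][in_channels]
    biases:      [num_classes]
    Returns:     [height][width][num_classes]
    """
    return [
        [
            [
                # One weighted sum per class, using only this pixel's features.
                sum(w * x for w, x in zip(class_weights, pixel)) + b
                for class_weights, b in zip(weights, biases)
            ]
            for pixel in row
        ]
        for row in feature_map
    ]


features = [[[1.0, 2.0], [0.0, 1.0]]]  # 1x2 image with 2 features per pixel
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 classes
biases = [0.0, 0.0, 1.0]
print(conv1x1(features, weights, biases))
```

The height and width stay the same; only the channel dimension changes, from the number of features to the number of classes.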
How to use the U-net snippet
To add a U-net snippet, open the Snippet section in the Inspector and click U-net.
The input images of the U-net snippet must have a height and a width that are multiples of 16, e.g. 32x48, 64x128, 64x32. This is because each 2D Max pooling block halves the image size, and the size must remain a whole number at every step.
In addition, the input must be at least 16x16 pixels, although we recommend using images of 32x32 pixels or larger if possible.
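The size requirements above can be checked before you upload a dataset. The following is a minimal sketch; both helper names are hypothetical. Rounding a size up to the next multiple of 16, for example to decide how much padding to add, is one possible preprocessing choice and not something the snippet does for you.

```python
def check_input_size(height, width, multiple=16, minimum=16):
    """Return True if both dimensions are at least `minimum` and are
    multiples of `multiple`."""
    return all(
        size >= minimum and size % multiple == 0
        for size in (height, width)
    )


def round_up_to_multiple(size, multiple=16):
    """Return the smallest multiple of `multiple` that is >= size."""
    return -(-size // multiple) * multiple


for height, width in [(32, 48), (64, 128), (30, 48)]:
    if check_input_size(height, width):
        print(f"{height}x{width} is a valid input size")
    else:
        target = (round_up_to_multiple(height), round_up_to_multiple(width))
        print(f"{height}x{width} is not valid; nearest larger size is {target}")
```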
Group of blocks
The blocks included in the U-net are grouped into functional groups. You can expand each group to inspect the blocks it contains. This makes the architecture easier to understand, modify, and manipulate.
You can still change the parameters of each individual block.
U-net for image segmentation
If your target consists of more than 2 classes, set the activation in the last 2D Convolution block to Softmax and set the loss function in the Target block to Categorical crossentropy.
Example: Satellite image segmentation where each pixel is labeled with a class. In this image, pixels are labeled as built-up (red), farmland (green) or unknown (grey).
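To show what the Softmax activation and the Categorical crossentropy loss compute for a single pixel, here is a minimal sketch in plain Python. It is an illustration of the math only, not the platform's implementation. The three class scores are made-up values, and the small `eps` term is an assumption added to avoid taking the logarithm of zero.

```python
import math


def softmax(scores):
    """Turn raw class scores for one pixel into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps math.exp from overflowing.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities, eps=1e-12):
    """Crossentropy between a one-hot target and predicted probabilities."""
    return -sum(
        t * math.log(p + eps)
        for t, p in zip(one_hot_target, probabilities)
    )


# One pixel, three classes (e.g. built-up, farmland, unknown).
probabilities = softmax([2.0, 0.5, -1.0])
loss = categorical_crossentropy([1, 0, 0], probabilities)
print(probabilities, loss)
```

For a whole image, the same computation is applied to every pixel and the per-pixel losses are averaged.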
Binary class segmentation
If your target consists of 2 classes encoded as an image of 0s and 1s, set the activation in the last 2D Convolution block to Sigmoid and set the loss function in the Target block to Binary crossentropy.
Example: The target is a black (0) and white (1) image mask. The image shows HeLa cells on glass recorded with microscopy, together with the generated segmentation mask (image from the original paper).
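For the binary case, the Sigmoid activation and the Binary crossentropy loss can be sketched per pixel as follows. This is an illustration of the math in plain Python, not the platform's implementation. The mask and the raw output values are made up, and the `eps` term is an assumption added to avoid taking the logarithm of zero.

```python
import math


def sigmoid(x):
    """Map a raw output value to a probability between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative inputs this form avoids overflow in math.exp.
    e = math.exp(x)
    return e / (1.0 + e)


def binary_crossentropy(target_mask, raw_outputs, eps=1e-12):
    """Mean binary crossentropy over all pixels of a 0/1 mask."""
    losses = []
    for target_row, output_row in zip(target_mask, raw_outputs):
        for t, z in zip(target_row, output_row):
            p = sigmoid(z)
            losses.append(-(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


mask = [[0, 1], [1, 1]]                    # black (0) and white (1) target
raw_outputs = [[-3.0, 2.0], [4.0, -1.0]]   # values before the Sigmoid
print(binary_crossentropy(mask, raw_outputs))
```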
U-net for image reconstruction
If you use the U-net for image reconstruction, choose a loss function that matches how you normalize your targets.
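As one possible pairing, and purely as an assumption for illustration, the sketch below scales 8-bit target pixels to the range 0 to 1 and measures the reconstruction error with mean squared error. Other normalizations may call for a different loss. Both helper names are hypothetical.

```python
def scale_to_unit_range(image, max_value=255):
    """Scale pixel values from [0, max_value] to [0, 1]."""
    return [[value / max_value for value in row] for row in image]


def mean_squared_error(target, prediction):
    """Mean of the squared per-pixel differences between two images."""
    squared = [
        (t - p) ** 2
        for target_row, prediction_row in zip(target, prediction)
        for t, p in zip(target_row, prediction_row)
    ]
    return sum(squared) / len(squared)


target = scale_to_unit_range([[0, 255], [255, 0]])
prediction = [[0.1, 0.9], [0.8, 0.0]]  # made-up model output in [0, 1]
print(mean_squared_error(target, prediction))
```

Whatever loss you pick, the model output and the targets need to be on the same scale for the error to be meaningful.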
Olaf Ronneberger, Philipp Fischer, Thomas Brox: U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015.