We recommend using the U-net snippet for image segmentation and reconstruction.
The U-net was introduced as a model for segmentation of biomedical images. It uses a convolutional autoencoder structure with skip connections that combine low-level and high-level information.
The U-net has a horseshoe shape with a downsampling and an upsampling path. Skip connections feed information from the downsampling path to the upsampling path.
The purpose of the downsampling path is to capture the context of the input image in order to be able to do segmentation. This coarse contextual information is transferred to the upsampling path by means of skip connections.
The purpose of the upsampling path is to enable precise localization combined with contextual information from the downsampling path.
The downsampling path follows the typical architecture of a convolutional network. It consists of the repeated application of:
Two 2D Convolution blocks with ReLU activation
A 2D Max pooling with stride 2 for downsampling. At each downsampling step, we double the number of filters.
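The downsampling steps above can be traced as simple shape bookkeeping. This is a minimal sketch, not the snippet's actual implementation: it assumes padded convolutions (so only the pooling changes the spatial size), and the names downsampling_shapes, base_filters and depth are hypothetical values chosen for illustration.

```python
def downsampling_shapes(height, width, base_filters=64, depth=4):
    """Trace (height, width, filters) along the downsampling path.

    Assumption: convolutions are padded, so only max pooling changes
    the spatial size. base_filters and depth are illustrative values.
    """
    shapes = []
    filters = base_filters
    for _ in range(depth):
        # Two convolutions with ReLU keep the spatial size and output
        # `filters` channels; this map also feeds the skip connection.
        shapes.append((height, width, filters))
        # Max pooling with stride 2 halves height and width.
        height, width = height // 2, width // 2
        # The next step uses twice as many filters.
        filters *= 2
    # Bottom of the horseshoe, after the last pooling.
    shapes.append((height, width, filters))
    return shapes


for shape in downsampling_shapes(256, 256):
    print(shape)
```

With a 256x256 input and four pooling steps, the spatial size shrinks to 16x16 while the filter count grows from 64 to 1024.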
Every step in the upsampling path consists of:
A 2D upsampling of the activation map
A 2D Convolution block that halves the number of filters
A Concatenate block that concatenates with the corresponding activation map from the downsampling path. In the original paper, this activation map is cropped first, because unpadded convolutions lose border pixels at every step.
Two 2D Convolutions with ReLU activation
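One upsampling step can be sketched the same way, as shape bookkeeping. The function name upsampling_step is hypothetical, and the sketch assumes 2x upsampling, padded convolutions (so the two maps already match in size), and that the two final convolutions output the halved filter count.

```python
def upsampling_step(shape, skip_shape):
    """Return the output shape of one upsampling step and the channel
    count right after concatenation. Shapes are (height, width, filters).
    """
    height, width, filters = shape
    # 2D upsampling doubles height and width.
    height, width = height * 2, width * 2
    # A convolution block halves the number of filters.
    filters //= 2
    # The skip connection must match the upsampled map spatially.
    skip_height, skip_width, skip_filters = skip_shape
    if (height, width) != (skip_height, skip_width):
        raise ValueError("skip connection and upsampled map differ in size")
    # Concatenation stacks the channels of both maps.
    concatenated_channels = filters + skip_filters
    # Two convolutions with ReLU reduce the channels to `filters` again
    # (an assumption of this sketch).
    return (height, width, filters), concatenated_channels


out_shape, concat_channels = upsampling_step((16, 16, 1024), (32, 32, 512))
print(out_shape, concat_channels)
```

Here a 16x16 map with 1024 filters becomes a 32x32 map with 512 filters, passing through 1024 channels right after the concatenation.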
At the final block, a 1x1 convolution is used to map each feature vector to the desired number of classes.
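A 1x1 convolution is a linear map applied to each pixel's feature vector independently. The following is a minimal pure-Python sketch on nested lists; the function conv1x1 and the weights are made up for illustration and are not taken from the snippet.

```python
def conv1x1(feature_map, weights, biases):
    """Apply a 1x1 convolution.

    feature_map: nested list indexed as [row][column][channel]
    weights:     nested list indexed as [class][channel]
    biases:      list indexed as [class]
    """
    output = []
    for row in feature_map:
        out_row = []
        for pixel in row:
            # Each pixel is mapped on its own; neighbors are not involved.
            scores = [
                sum(w * x for w, x in zip(class_weights, pixel)) + bias
                for class_weights, bias in zip(weights, biases)
            ]
            out_row.append(scores)
        output.append(out_row)
    return output


features = [[[1.0, 2.0], [0.0, 1.0]]]           # 1x2 image, 2 channels
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 classes, 2 channels
biases = [0.0, 0.0, 0.5]
class_scores = conv1x1(features, weights, biases)
print(class_scores)
```

The spatial size is unchanged, and each pixel goes from 2 feature channels to 3 class scores.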
To add a U-net snippet, open the Snippet section in the Inspector and click U-net.
The images in the dataset should be 256x256 pixels or larger.
If your target consists of more than 2 classes, set the activation in the last 2D Convolution block to Softmax and set the loss function in the Target block to Categorical crossentropy.
Example: Satellite image segmentation where each pixel is labeled with a class. In this image, pixels are labeled as built-up (red), farmland (green) or unknown (grey).
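For intuition, this is what Softmax and Categorical crossentropy compute for a single pixel. It is a minimal sketch using only the standard library; the scores and the one-hot target are invented values, and the platform's own implementation may differ in details such as clipping.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the maximum avoids overflow without changing the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_target, probabilities)
    )


pixel_scores = [2.0, 1.0, 0.1]   # e.g. built-up, farmland, unknown
pixel_target = [1, 0, 0]         # this pixel is labeled built-up
probs = softmax(pixel_scores)
loss = categorical_crossentropy(pixel_target, probs)
print(probs, loss)
```

The loss for the whole image is typically the average of this per-pixel value.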
If your target consists of 2 classes encoded as an image of 0s and 1s, set the activation in the last 2D Convolution block to Sigmoid and set the loss function in the Target block to Binary crossentropy.
Example: The target is a black (0) and white (1) image mask. The image shows HeLa cells on glass recorded with microscopy, together with the generated segmentation mask (image from the original paper).
If you use U-net for image reconstruction, use a loss function tailored to how you normalize your targets.
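The two-class case can be sketched per pixel as follows. This is a standard-library illustration with an invented 2x2 mask and invented scores, not the platform's implementation; the clipping constant eps is an assumption.

```python
import math


def sigmoid(score):
    """Map a raw score to a probability of the pixel being 1 (white)."""
    return 1.0 / (1.0 + math.exp(-score))


def binary_crossentropy(target, probability, eps=1e-7):
    """Cross-entropy for one pixel with a 0/1 target."""
    # Clip to keep the logarithm finite.
    p = min(max(probability, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


mask = [[0, 1], [1, 1]]                # target mask of 0s and 1s
scores = [[-2.0, 3.0], [0.5, -1.0]]    # raw outputs before the Sigmoid
losses = [
    [binary_crossentropy(t, sigmoid(s)) for t, s in zip(mask_row, score_row)]
    for mask_row, score_row in zip(mask, scores)
]
mean_loss = sum(sum(row) for row in losses) / 4
print(losses, mean_loss)
```

The bottom-right pixel has target 1 but a negative score, so it contributes the largest loss.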
Olaf Ronneberger, Philipp Fischer, Thomas Brox: U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015.