Image segmentation / mark a single object type within an image / cheat sheet

Target audience: Data scientists and developers

Problem formulation

Use this cheat sheet:
If your input data consists of a set of images (.jpg or .png), where each image contains one or several objects of one specific type.

The selected model should be able to mark, pixel by pixel, where the object is in the image.

Example use cases

Note
Disclaimer
Please note that data sets, models and other content, including open source software, (collectively referred to as "Content") provided and/or suggested by Peltarion for use in the Platform, may be subject to separate third party terms of use or license terms. You are solely responsible for complying with the applicable terms. Peltarion makes no representations or warranties about Content and specifically disclaims all responsibility for any liability, loss, or risk, which is incurred as a consequence, directly or indirectly, of the use or application of any of the Content.

  • Marking the location of a skin lesion in an image (e.g., the ISIC challenge).

  • Marking the location of all human faces in an image.

  • Marking the position of a specific organ, e.g., the heart in an X-ray image.

Data preparation

Data input requirements

You will need to prepare a zip file containing:

  • One folder with all input images

  • One folder with all mask images that mark the true location of the object in each input image. A mask is a grayscale image (.jpg or .png) in which 255 represents object pixels and 0 represents background pixels

  • A corresponding index.csv

Structure of index.csv

You need to structure the data so that on each row an input image is mapped to its corresponding mask image.

image,mask
image_1.jpg,mask_1.jpg
image_2.jpg,mask_2.jpg
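
As a rough sketch of this preparation step, the script below builds index.csv from two folders and bundles everything into a zip file. The folder names images/ and masks/, the output file name dataset.zip, and the path format written to the CSV are assumptions; adjust them to match how your files are laid out in the zip.

import csv
import os
import zipfile

image_dir, mask_dir = 'images', 'masks'  # assumed folder names
images = sorted(os.listdir(image_dir))
masks = sorted(os.listdir(mask_dir))
assert len(images) == len(masks), 'every image needs exactly one mask'

# Map each input image to its mask, one pair per row.
with open('index.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['image', 'mask'])
    for image, mask in zip(images, masks):
        writer.writerow([f'{image_dir}/{image}', f'{mask_dir}/{mask}'])

# Bundle both folders and index.csv into a single zip for upload.
with zipfile.ZipFile('dataset.zip', 'w') as zf:
    zf.write('index.csv')
    for folder in (image_dir, mask_dir):
        for name in sorted(os.listdir(folder)):
            zf.write(os.path.join(folder, name))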

Use same-sized images

All of your input images and mask images need to be the same size when you upload them to the Platform. If they are different sizes, resize them before creating the zip file, e.g., with the Python Imaging Library (PIL). Example:

from PIL import Image

# Open the image, resize it, and save the result back to disk.
im = Image.open('image1.jpg')
im = im.resize((64, 64))
im.save('image1.jpg')

By far the easiest way to resize images and masks is to use the Sidekick library. It helps you resize and crop the images uniformly and also create a complete dataset bundle, ready to be uploaded to the Platform. An example of how to use it is available in the preprocessing notebook for the Skin lesion segmentation tutorial.

Note
Make sure that you resize the input image and the mask image in the same way, and that the resizing method does not introduce any grayscale values into the mask images; each mask should contain only the values 0 and 255.
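
As a minimal sketch of a mask-safe resize, the snippet below uses nearest-neighbor resampling, which copies existing pixel values instead of interpolating between them, and then checks that the mask is still binary. The file names are placeholders; PNG is used because it is lossless, so no intermediate values can sneak back in when saving.

from PIL import Image
import numpy as np

# Nearest-neighbor resampling copies pixel values instead of
# blending them, so a 0/255 mask stays a 0/255 mask.
mask = Image.open('mask_1.png').convert('L')
mask = mask.resize((64, 64), resample=Image.NEAREST)

# Check that no intermediate grayscale values were introduced.
values = np.unique(np.array(mask))
assert set(values) <= {0, 255}, f'unexpected mask values: {values}'

mask.save('mask_1_resized.png')  # PNG is lossless, keeps values exact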

Image size should be divisible by 32 for Tiramisu and 16 for U-net

Models for image segmentation include a down-sampling path, which shrinks the image by a specific factor, and an inverse up-sampling path, which expands the image back up to the original size.

For Tiramisu, the image size should be divisible by 32. The image can be rectangular, e.g., 64x128 or 256x128.

For U-net, the image size should be divisible by 16. Recommended image sizes are 192x192, 224x224 or 256x256.
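
If your images are not already a valid size, a small helper like this (purely illustrative, not part of the Platform) rounds a dimension up to the nearest acceptable multiple before you resize:

def round_up(x, multiple):
    # Smallest value >= x that is divisible by `multiple`.
    return -(-x // multiple) * multiple

# Tiramisu needs sizes divisible by 32, U-net by 16.
print(round_up(200, 32))  # 224
print(round_up(200, 16))  # 208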

Change preprocessing in the Datasets view

Once the dataset is uploaded to the Platform, set Preprocessing to No preprocessing for both the input image column and the mask image column.

[Image: feature settings for image segmentation in the Datasets view]

Modeling

Select the U-net or Tiramisu snippet. Both snippets work well for image segmentation.

Changes in the Modeling view

On the last 2D Convolution block, make sure that the number of filters is 1 and that the activation is Sigmoid. Use the sigmoid activation because the prediction for each pixel should be independent of the predictions for all other pixels.

On the Target block, set the loss function to Binary crossentropy.
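
For intuition, these settings correspond roughly to the following Keras code. This is an illustrative equivalent only, not what the Platform generates; the input shape and kernel size are placeholder assumptions.

from tensorflow import keras

# Placeholder for the output of the U-net/Tiramisu up-sampling path.
inputs = keras.Input(shape=(64, 64, 16))

# Final 2D convolution: 1 filter + sigmoid yields an independent
# object probability for every pixel.
outputs = keras.layers.Conv2D(1, kernel_size=1,
                              activation='sigmoid')(inputs)

model = keras.Model(inputs, outputs)

# Binary crossentropy matches the per-pixel object/background target.
model.compile(optimizer='adam', loss='binary_crossentropy')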

[Image: last blocks of the model for marking a single object type within an image]

Evaluation

The Evaluation view will display a confusion matrix where 0 means background and 1 means object.

Each count is the number of pixels that fall into that combination of actual and predicted class.

Note
This may be a sampled count if your dataset is large, so the total sum over the confusion matrix may not equal the total number of pixels in your validation dataset.
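
If you want to reproduce such a pixel-level confusion matrix offline, a sketch along these lines works (scikit-learn is an assumption here, not something the Platform requires):

import numpy as np
from sklearn.metrics import confusion_matrix

# Toy true and predicted masks with values 0 (background) and
# 255 (object); in practice, load these from your mask files.
true_mask = np.array([[0, 255], [255, 255]])
pred_mask = np.array([[0, 255], [0, 255]])

# Flatten to per-pixel labels: 0 = background, 1 = object.
y_true = (true_mask.ravel() == 255).astype(int)
y_pred = (pred_mask.ravel() == 255).astype(int)

print(confusion_matrix(y_true, y_pred, labels=[0, 1]))
# [[1 0]
#  [1 2]]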