Image segmentation / mark a single object type within an image / cheat sheet
Use this cheat sheet
Use this cheat sheet if your input data consists of a set of images (.jpg or .png), where each image contains one or several objects of a single type.
The selected model should be able to mark, pixel by pixel, where the object is in the image.
Example use cases
Marking the location of a skin lesion in an image.
Marking the location of all human faces in an image.
Marking the position of a specific organ, e.g., the heart in an X-ray image.
Data input requirements
You will need to prepare a zip file containing:
One folder with all input images
One folder with all mask images that represent the true location of the object in each input image. A mask is a grayscale image (.jpg or .png) where 255 marks object pixels and 0 marks background pixels
A corresponding index.csv
Structure of index.csv
You need to structure the data so that each row maps an input image to its corresponding mask image.
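For illustration, a minimal index.csv could look like the fragment below. The column headers and file paths are hypothetical; use the names that match your own folder structure.

```
image,mask
images/image1.jpg,masks/mask1.png
images/image2.jpg,masks/mask2.png
```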
Use same sized images
All of your input images and mask images need to be the same size when uploading to the Platform. If they have different sizes, resize them before creating the zip file, e.g., with the Python Imaging Library (PIL). Example:
from PIL import Image

im = Image.open('image1.jpg')
im = im.resize((64, 64))
By far the easiest way to resize images and masks is to use the Sidekick library. It helps you resize and crop the images uniformly, and also to create a complete dataset bundle, ready to be uploaded to the Platform. An example of how to use it is available in the preprocessing notebook for the Skin lesion segmentation tutorial.
|Make sure that you resize the input image and mask image in the same way, and that the resizing method does not introduce any grayscale into the mask images; each mask should contain only the values 0 and 255.|
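The note above can be honored by choosing the resampling filter per image type: smooth filters are fine for input images, but masks need nearest-neighbor resampling, which never interpolates between pixel values. A minimal sketch (in practice you would `Image.open(...)` your own files; here synthetic stand-ins keep it self-contained):

```python
from PIL import Image

SIZE = (224, 224)

# Stand-ins for Image.open('image1.jpg') and Image.open('mask1.png'):
im = Image.new('RGB', (300, 200), color=(120, 80, 40))
mask = Image.new('L', (300, 200), color=0)
mask.paste(255, (50, 50, 150, 150))  # a white rectangle marking the "object"

im_small = im.resize(SIZE, Image.BILINEAR)     # smooth resampling is fine for the input image
mask_small = mask.resize(SIZE, Image.NEAREST)  # NEAREST keeps mask values at exactly 0 or 255

# The resized mask is still strictly binary:
assert set(mask_small.getdata()) <= {0, 255}
```

With a smooth filter such as BILINEAR, the mask's edges would be blended into intermediate gray values, which is exactly what the note warns against.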
Image size should be divisible by 16
Models for image segmentation include a down-sampling path, which shrinks the image by a specific factor, and a mirrored up-sampling path, which expands the image back to its original size. Because the down-sampling path typically halves each dimension four times, a factor of 2^4 = 16 in total, the image size should be divisible by 16. Recommended image sizes are 192x192, 224x224, or 256x256.
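The divisibility requirement can be sketched as follows, assuming four 2x down-sampling steps (a common U-net setup); integer halving only round-trips back to the original size when that size is divisible by 16:

```python
def survives_unet(size, steps=4):
    """Check that `size` can be halved `steps` times and restored exactly."""
    down = size
    for _ in range(steps):
        down //= 2  # each down-sampling block halves the dimension
    return down * 2 ** steps == size  # the up-sampling path doubles it back

print(survives_unet(224))  # True: 224 = 14 * 16
print(survives_unet(100))  # False: 100 is not divisible by 16
```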
No changes in the Datasets view
You don’t have to change anything in the Datasets view.
Select Image segmentation as Problem type. The Experiment wizard will then build a U-net deep learning network.
On the last 2D Convolution block, make sure the number of filters is 1 and the activation is Sigmoid. Use the sigmoid activation since the prediction for each pixel should be independent of the other pixels.
On the Target block, make sure the loss function is set to Binary crossentropy.
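A NumPy sketch of what these two settings compute per pixel: the sigmoid turns each pixel's logit into an independent probability, and binary crossentropy compares it to the 0/1 mask. The values and names here are illustrative, not the Platform's internals.

```python
import numpy as np

def sigmoid(x):
    # Maps each logit independently to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

logits = np.array([[2.0, -3.0], [0.0, 5.0]])  # one logit per pixel from the last conv block
probs = sigmoid(logits)                       # each pixel's prediction is independent
mask = np.array([[1.0, 0.0], [0.0, 1.0]])     # ground-truth mask rescaled from {0, 255} to {0, 1}

loss = binary_crossentropy(mask, probs)
```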
In the Evaluation view's Predictions inspection, you can see a confusion matrix where 0 means background and 1 means object.
The count represents the number of pixels that have been classified accordingly.
|This may be a sampled count if your dataset is large, so the total sum over the confusion matrix may not equal the total number of pixels in your validation dataset.|
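For intuition, such a pixel-wise confusion matrix can be sketched with NumPy as below; the 0.5 threshold and the array shapes are assumptions for illustration, not the Platform's exact procedure.

```python
import numpy as np

def pixel_confusion_matrix(y_true, y_prob, threshold=0.5):
    """2x2 confusion matrix over pixels; rows = true class, columns = predicted class."""
    y_pred = (y_prob >= threshold).astype(int)
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true.ravel(), y_pred.ravel()):
        cm[t, p] += 1
    return cm

truth = np.array([[0, 0, 1], [0, 1, 1]])              # 0 = background, 1 = object
probs = np.array([[0.1, 0.7, 0.9], [0.2, 0.4, 0.8]])  # per-pixel sigmoid outputs
cm = pixel_confusion_matrix(truth, probs)
# Without sampling, cm.sum() equals the total number of pixels (6 here)
```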