Self sorting wardrobe
Solve an image classification problem with a convolutional neural network (CNN)
This tutorial goes into the details of how to build a deep learning experiment on the Peltarion Platform and shows you why you take each step on the way to solving your problem.
- Target audience: Beginners
You will learn to
- Understand the building blocks and settings of a deep learning model.
- Run multiple experiments.
Preread: We suggest that you start your deep dive into the Peltarion Platform with the Deploy an operational AI model tutorial. It’ll walk you through the workflow at a higher level than this tutorial.
McKinsey Global Institute identifies the retail industry as the sector likely to create the most annual value from AI techniques (a potential $600B). In 2017, the institute stated: “AI technologies could eliminate many levels of manual activities in areas such as promotions, assortments, and supply chain. AI will enable retailers to increase both the number of customers and the average amount they spend by creating personal and convenient shopping experiences.”
Deep learning image classification has many applications in the retail industry and will drive much of this predicted value. Examples include reducing errors in supply chain management (accurate inventory and catalog management by automatically identifying items from a photo) and serving as a component in visual search from user-generated content (a customer uses a photo taken on their mobile device to locate or search for similar items in a shopping catalog).
First, create a project and name it, so you know what kind of project it is. Naming is important.
A project combines all of the steps in solving a problem, from the preprocessing of datasets to model building, evaluation and deployment.
Add and manage the dataset
After creating the project, you will be taken to the Datasets view, where you can import data.
Click the Import free datasets button.
Look for the Fashion MNIST - tutorial data dataset in the list.
The Fashion MNIST dataset is an image classification dataset that consists of small 28 x 28 pixel images of clothing and accessories such as shirts, bags, shoes, and other fashion items. Each image is annotated with a label indicating the correct garment. The images come from Zalando and consist of a training set of 60,000 examples and a test set of 10,000 examples.
If you agree with the license, click Accept and import.
This will import the dataset into your project, and you can now edit it.
Subsets of the Fashion MNIST dataset
Click Show advanced settings.
The dataset is by default split into three subsets:
That means we’ll use 80% of the set to train our model and the remainder as "not seen before" examples to see how well the training is progressing.
The percentage of correctly guessed labels on the validation subset will produce the accuracy number we’re after.
Keep the default subset split.
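To make the idea of a subset split concrete, here is a minimal Python sketch of a random 80/20 split. It is only an illustration, not how the platform does it internally: the function name split_indices, the fixed seed, and the example count of 1,000 are all assumptions made for this example.

```python
import random


def split_indices(n_examples, train_fraction=0.8, seed=0):
    """Shuffle example indices and split them into a training part and a held-out part."""
    indices = list(range(n_examples))
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_train = round(n_examples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Hypothetical dataset of 1,000 rows: 800 go to training, 200 are held out.
train_idx, held_out_idx = split_indices(1000)
print(len(train_idx), len(held_out_idx))
```

The important property is that no example appears in both parts, so the held-out examples really are "not seen before" by the model.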
Build a model with the Experiment wizard
You’ve now created a dataset ready to be used in the platform. It’s time to create an experiment!
Click Save version and then Use in new experiment.
The experiment wizard pops up.
Make sure that the FashionMNIST dataset is selected.
Inputs / target tab
Select image as Input feature and category as Target feature.
Problem type tab
Select Single-label image classification.
Single-label image classification is when a deep learning model predicts one class for each example.
A complete Convolutional Neural Network (CNN) now populates the Modeling canvas.
A CNN is often used when you want to solve an image classification problem. This network looks for low-level features such as edges and curves and then builds up to more abstract concepts through a series of convolutional layers. The CNN consists of the following types of blocks:
2D Convolution. This block is used to detect spatial features in an image.
Max pooling 2D. This layer reduces the size of the data. You can say that 2D max pooling is similar to scaling down the size of an image.
Batch normalization. This normalizes all input features to a similar range of values which will speed up learning.
Dense. This is a densely connected neural network layer.
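If you are curious what the 2D Convolution and Max pooling 2D blocks compute, the following pure-Python sketch shows both operations on a tiny image. It is a simplified illustration (no padding, stride 1, a single channel, a hand-picked kernel), not the platform's implementation. In a real CNN the kernel values are learned during training; the image and the vertical_edge kernel here are made up for the example.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5 x 5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [[-1, 1],
                 [-1, 1]]

features = conv2d(image, vertical_edge)  # 4 x 4 map, large values where the edge is
pooled = max_pool2d(features)            # 2 x 2 map, half the size in each direction

for row in features:
    print(row)
print(pooled)
```

The convolution output is non-zero only in the column where the image changes from dark to bright, and max pooling keeps that signal while shrinking the map.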
Although the CNN model, in this case, will identify fashion items, it can be trained on any class you require. For example, in healthcare, this technique could take a brain scan as input and predict if it contains a tumor or it could be used to classify a personal photo library, much as Apple or Google do with their photo applications.
If you want to dive deeper into CNNs and how they work, we suggest you read this article:
A Beginner’s Guide To Understanding Convolutional Neural Networks.
The experiment is done and ready to be trained. All settings have been pre-populated by the platform, for example:
In the last Dense block in the model the Units value is set to 10. This is the number of labels (shirt, bag, shoe, etc.).
The Activation is set to Softmax.
This activation function is often used together with categorical crossentropy. The softmax function highlights the largest values and suppresses low values. This in effect allows only 1 of the 10 nodes of the dense layer to put its hand up. There are times when you don’t want to squash all in favor of one (like saying this t-shirt is both red and has a logo), but in this experiment we’re trying to say "this is a t-shirt and not a coat or a bag or trousers or…”.
The target feature is set to category in the Target block.
The Loss is set to Categorical crossentropy. This loss function computes a score that tells the model how far its predicted probabilities are from the correct garment label, and training tries to make that score as small as possible. If there were only 2 classes in our data (t-shirts and shoes), you could choose binary crossentropy. We have 10 classes, thus categorical crossentropy.
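The relationship between softmax and categorical crossentropy can be sketched in a few lines of Python. This uses the standard textbook definitions for a single example, not the platform's code, and the ten raw scores are invented for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow and does not change the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(probabilities, true_index):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(probabilities[true_index])


# Invented raw outputs of the last Dense block, one per class (10 classes).
scores = [4.0, 1.0, 0.5, 0.2, 0.0, -1.0, 0.3, 1.5, 0.1, -0.5]

probs = softmax(scores)
predicted = probs.index(max(probs))  # the class that "puts its hand up"

# The loss is small if the true class got a high probability, large otherwise.
loss_if_class_0_is_true = categorical_crossentropy(probs, true_index=0)
loss_if_class_3_is_true = categorical_crossentropy(probs, true_index=3)

print(predicted, round(sum(probs), 6))
print(loss_if_class_0_is_true < loss_if_class_3_is_true)
```

Because the probabilities must sum to 1, raising one class necessarily lowers the others, which is why softmax suits problems where exactly one label is correct.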
Run experiment to train model
Your model is now ready to be trained. Navigate to the Settings tab in the Inspector. For your info:
Batch size is how many rows (examples) are processed at the same time.
One Epoch is when the complete dataset has run through the model one time. That means that if you set Epochs to 10, the complete dataset has run through the model 10 times.
The Optimizer is how the system optimizes the loss with respect to the weights of the network.
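The interplay between batch size and epochs can be sketched as a plain Python loop. The numbers (1,000 examples, batch size 32, 10 epochs) and the helper iterate_batches are hypothetical and chosen only to make the arithmetic easy to follow. The sketch assumes one weight update per batch, which is how mini-batch training typically works.

```python
import math


def iterate_batches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


examples = list(range(1000))  # stand-in for 1,000 training rows
batch_size = 32
epochs = 10

updates = 0
for epoch in range(epochs):
    # One epoch: every example passes through the model once.
    for batch in iterate_batches(examples, batch_size):
        updates += 1  # a real training loop would update the weights here

batches_per_epoch = math.ceil(len(examples) / batch_size)
print(batches_per_epoch, updates)
```

Note that the last batch of each epoch is smaller than the others whenever the number of examples is not a multiple of the batch size.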
Done! Click Run.
OK, let’s see if you can improve the model while the first model trains. You can probably find some inspiration in our articles on how to improve experiments—both for beginners and for intermediate users.
If you’re on the Free plan you can run 1 experiment at a time. All other plans can run concurrent experiments.
From the Evaluation view you can click the Tune experiment button.
With the Tune experiment function, the platform suggests a couple of changes that might improve your model’s performance. For this use case you can increase or decrease the learning rate.
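To see why the learning rate matters, consider this small sketch. It uses plain gradient descent on a toy loss, L(w) = w**2, which is an assumption made for illustration. The platform's optimizer and your model's real loss surface are more complex, and the three learning rates are arbitrary example values.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize the toy loss L(w) = w**2, whose gradient is 2 * w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w
        w = w - learning_rate * gradient  # step against the gradient
    return w


too_small = gradient_descent(0.001)   # barely moves away from the start value
reasonable = gradient_descent(0.1)    # ends close to the minimum at w = 0
too_large = gradient_descent(1.1)     # overshoots further on every step

print(too_small, reasonable, too_large)
```

A learning rate that is too low makes training slow, while one that is too high can make the loss grow instead of shrink, which is why trying both directions is a sensible first tuning step.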
To make more modifications to the model, navigate back to the Modeling view, and click on Duplicate. This will create a copy of your current model that you can edit, but training progress will be lost.
Another way to improve the model is to add blocks and change settings in the model. As long as you keep the same loss function, you can compare the results of the experiments in the Evaluation view and see which one is best.
Another way to improve the model is to click on Iterate in the Modeling view. Now you can do the following:
Reuse part of model creates a new experiment with a single block that contains the model you just trained. This is useful to build another model around the current one.
Navigate to the Evaluation view and select your experiment in the left side Experiment section.
Aim for high accuracy
In this tutorial, it’s validation accuracy that matters. Accuracy is how often you predict the right answer. More precisely, the formula is:
Accuracy = number of correct predictions / total number of predictions
For example, if the number of test samples is 1,000 and the model classifies 952 of those correctly, then the model’s accuracy is 95.2%.
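The same calculation can be written as a short Python function. The label lists below are constructed only to reproduce the example of 952 correct predictions out of 1,000; they are not real model output.

```python
def accuracy(true_labels, predicted_labels):
    """Share of predictions that match the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("Both lists must have the same length.")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Constructed example: 1,000 samples, 952 of them classified correctly.
true_labels = ["shirt"] * 1000
predicted_labels = ["shirt"] * 952 + ["bag"] * 48

acc = accuracy(true_labels, predicted_labels)
print(f"{acc:.1%}")
```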
We’ll set our goal to 0.9, i.e., a correct prediction 9 out of 10 times. Note that the “world record” is 0.967. For more benchmarks, check Zalando’s collected benchmark classifiers here: Fashion MNIST benchmark.
Select the Accuracy graph and you’ll notice that in your first experiment the training and validation curves diverge after a few epochs. This is where the experiment starts to overfit. That means that the model is memorizing the training pictures rather than understanding in more general terms what the shapes and shadows may depict.
At the best epoch we have accuracy 0.91. Does that beat the goal you set? Yes, it does!
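The idea of a "best epoch" and of diverging curves can be sketched as follows. The per-epoch accuracy values are invented for illustration and are not results from the platform; the helper names best_epoch and generalization_gap are also hypothetical.

```python
# Invented per-epoch accuracies, for illustration only.
train_acc = [0.80, 0.86, 0.89, 0.92, 0.94, 0.96, 0.97, 0.98]
val_acc = [0.79, 0.85, 0.88, 0.90, 0.91, 0.90, 0.90, 0.89]


def best_epoch(validation_accuracy):
    """Return the 1-based epoch with the highest validation accuracy."""
    best_index = max(range(len(validation_accuracy)),
                     key=lambda i: validation_accuracy[i])
    return best_index + 1


def generalization_gap(train, validation):
    """Training accuracy minus validation accuracy, per epoch."""
    return [round(t - v, 2) for t, v in zip(train, validation)]


epoch = best_epoch(val_acc)
gaps = generalization_gap(train_acc, val_acc)

print(epoch, val_acc[epoch - 1])
print(gaps)
```

In these made-up curves, training accuracy keeps rising while validation accuracy peaks and then slips, so the gap between them widens. That widening gap is the typical sign of overfitting.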
The Confusion matrix in the Predictions inspection tab is used to see how well a system does classification. In a perfect classification, you’ll have 100% on the diagonal going from top left to bottom right.
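A confusion matrix is easy to build by hand, as this sketch shows. It assumes rows are true classes and columns are predicted classes, which may differ from how the platform lays out its matrix. The three classes and the eight predictions are invented for the example.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def row_percentages(matrix):
    """Express each row as percentages of that true class."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100 * c / total if total else 0.0 for c in row])
    return result


classes = ["shirt", "coat", "bag"]
true_labels = ["shirt", "shirt", "shirt", "shirt", "coat", "coat", "bag", "bag"]
predicted_labels = ["shirt", "shirt", "shirt", "coat", "coat", "shirt", "bag", "bag"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
percent = row_percentages(matrix)

for name, row in zip(classes, percent):
    print(name, row)
```

Here only the bag class reaches 100% on the diagonal. The off-diagonal cells show that shirts and coats are sometimes confused with each other, which is the kind of insight the matrix is meant to give you.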
This tutorial has shown you how quickly and easily you can create and test experiments on the Peltarion Platform. You’ve built a basic CNN model to solve a classification problem and acquired some basic deep learning knowledge.
Next steps / Read more
In this tutorial, only one label can be correct. But what if the object could have many labels, e.g., a shirt labeled both “red” and “short sleeve”? How do you solve such a problem? Check out our tutorial Predicting mood from raw audio data and learn how to solve a multi-label classification problem.