Target audience: Data scientists and Developers
Preread: We suggest that you start your deep dive into the Peltarion Platform with the Deploy an operational AI model tutorial. The link below is suggested pre-reading for those unfamiliar with CNNs and how they work. It is not meant to be an exhaustive learning reference. A Beginner's Guide To Understanding Convolutional Neural Networks
McKinsey Global Institute identifies the retail industry as the sector likely to create the most annual value from AI techniques (a potential $600bn). In 2017 they stated: “AI technologies could eliminate many levels of manual activities in areas such as promotions, assortments, and supply chain. AI will enable retailers to increase both the number of customers and the average amount they spend by creating personal and convenient shopping experiences.”
Deep learning image classification has many applications in the retail industry and will drive much of this predicted value. Examples include reducing errors in supply chain management (accurate inventory and catalog management by automatically identifying items from photos) and serving as a component in visual search from user-generated content (a customer uses a photo taken on their mobile device to search for similar items in a shopping catalog).
Image classification in deep learning is often implemented using a technique called a convolutional neural network (CNN). In this tutorial, we will build a CNN and train it on thousands of images of fashion items (e.g., clothing, accessories, shoes). The resulting model can predict what type of fashion item a given image shows.
Although the CNN model in this case will identify fashion items, it can be trained on any classes you require. For example, in healthcare, this technique could take a brain scan as input and predict whether it contains a tumor, or it could be used to classify a personal photo library, much as Apple and Google do with their photo applications.
You will use the Fashion MNIST dataset, an image classification dataset that consists of small 28 x 28 pixel images of clothing and accessories such as shirts, bags, and shoes. Each image is annotated with a label indicating the correct garment. The images come from Zalando, and the dataset consists of a training set of 60,000 examples and a test set of 10,000 examples.
In this tutorial, it’s validation accuracy that matters. Accuracy is how often you predict the right answer. The formula is:
accuracy = number of correct predictions / total number of predictions
For example, if the number of test samples is 1000 and the model classifies 952 of those correctly, then the model's accuracy is 95.2%.
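As a minimal sketch, accuracy (the share of predictions that match the true labels) can be computed in a few lines of Python. The function name `accuracy` and the sample labels are illustrative assumptions and are not part of the Peltarion Platform.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("Both label lists must have the same length")
    if not true_labels:
        raise ValueError("Cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical data: 1000 test samples, 952 of them classified correctly.
true_labels = ["shirt"] * 1000
predicted_labels = ["shirt"] * 952 + ["bag"] * 48

print(accuracy(true_labels, predicted_labels))  # 0.952
```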
We’ll set our goal to 0.9, i.e., a correct prediction 9 out of 10 times. Note that the “world record” is 0.967. For more benchmarks, check Zalando’s collected benchmark classifiers here: Fashion MNIST benchmark.
First, create a project and name it, so you know what kind of project it is. Naming is important.
A project combines all of the steps in solving a problem, from pre-processing of datasets to model building, evaluation and deployment.
As you can see in the Inspector on the right, the dataset is by default split into two subsets: 80% for training and 20% for validation. That means we'll use 80% of the set to train our model and the remaining 20% as 'not seen before' examples to see how well the training is progressing. The percentage of correctly guessed labels on the validation set will produce the accuracy number we're after. Use the default subsets for this tutorial.
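To make the idea of an 80/20 split concrete, here is a rough Python sketch. How the platform actually assigns examples to subsets is not described in this tutorial, so the random shuffle, the `seed` parameter and the function name `split_dataset` are assumptions made for illustration.

```python
import random


def split_dataset(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of the examples and split it into training and validation subsets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset of 1000 example ids.
examples = list(range(1000))
train, validation = split_dataset(examples)

print(len(train), len(validation))  # 800 200
```

The validation examples never appear in the training subset, which is what makes them usable as 'not seen before' data.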
You’ve now created a dataset ready to be used in the platform. Click Save version. Saving this version will lock and version it and allow you to build a model with it. Navigate to the Modeling view.
Time to create an experiment in the Modeling view. The experiment contains all the information needed to reproduce the experiment:
The result from this experiment is a trained AI model that can be evaluated and deployed.
The experiment is done and ready to be trained. Navigate to the Settings tab in the Inspector. In the Run settings section change Batch size to 256 and keep the default values for the rest. For your info:
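As a rough illustration of what the batch size controls: the model looks at one batch of examples at a time, so the batch size determines how many batches make up one pass (epoch) over the training subset. The sketch below assumes a training subset of 48,000 examples (80% of 60,000) and assumes that a final partial batch is still processed. Both are assumptions for illustration, not documented platform behavior.

```python
import math


def batches_per_epoch(num_training_examples, batch_size):
    """Number of batches needed to see every training example once."""
    return math.ceil(num_training_examples / batch_size)


# Assumed training subset size: 80% of 60,000 examples.
print(batches_per_epoch(48_000, 256))  # 188
```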
Done!! Click Run in the top right corner.
Navigate to the Evaluation view and select your experiment in the left side Experiment section. Watch it train epoch by epoch.
Select the Accuracy graph and you’ll notice that after epoch 5 the training and validation graphs diverge: the experiment starts to overfit. That means that the model is just memorizing the pictures instead of understanding in more general terms what the shapes and shadows may depict.
At the best epoch, number 5, the accuracy is 0.91. Does that beat the goal you set at the beginning of this tutorial? Yes, it does!!
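The pattern of diverging curves can be sketched in Python as follows. The accuracy values are made up to mimic that shape and are not output from the platform, and the helper `best_epoch` is a hypothetical name. Epochs are assumed to be numbered from 1.

```python
def best_epoch(validation_accuracy):
    """Return the 1-based epoch with the highest validation accuracy."""
    best_index = max(range(len(validation_accuracy)),
                     key=lambda i: validation_accuracy[i])
    return best_index + 1


# Hypothetical per-epoch accuracies: training keeps improving,
# validation peaks and then slowly declines.
training_accuracy = [0.82, 0.87, 0.90, 0.92, 0.93, 0.95, 0.96, 0.97]
validation_accuracy = [0.84, 0.88, 0.90, 0.905, 0.91, 0.908, 0.905, 0.90]

epoch = best_epoch(validation_accuracy)
final_gap = training_accuracy[-1] - validation_accuracy[-1]

print(epoch)          # 5
print(final_gap > 0)  # True: training accuracy ends above validation accuracy
```

A growing gap between training and validation accuracy after the best epoch is the signature of overfitting.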
If you click Categorical crossentropy you’ll see the Loss graph. Loss indicates the magnitude of the error your model made in its predictions. It’s a method of evaluating how well your algorithm models your dataset. If your predictions are totally off, your loss function will output a higher number. If they’re pretty good, it’ll output a lower one.
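For a single example, categorical crossentropy is commonly defined as the negative sum over classes of the true label (one-hot encoded) times the logarithm of the predicted probability. The Python sketch below follows that textbook definition. The three-class probabilities are invented for illustration, and the `epsilon` guard against log(0) is an implementation assumption.

```python
import math


def categorical_crossentropy(true_one_hot, predicted_probs, epsilon=1e-12):
    """Loss for one example: -sum(t * log(p)) over all classes."""
    return -sum(t * math.log(max(p, epsilon))
                for t, p in zip(true_one_hot, predicted_probs))


true_label = [0, 1, 0]                  # the second class is the correct one
good_prediction = [0.05, 0.90, 0.05]    # confident and right
bad_prediction = [0.70, 0.10, 0.20]     # confident and wrong

loss_good = categorical_crossentropy(true_label, good_prediction)
loss_bad = categorical_crossentropy(true_label, bad_prediction)

print(round(loss_good, 3))  # about 0.105
print(round(loss_bad, 3))   # about 2.303
```

The worse prediction yields the higher loss, which matches the behavior described above.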
The Confusion matrix is used to see how well a system does classification. In a perfect classification, you'll have 100% on the diagonal going from top left to bottom right.
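A confusion matrix can be built by counting (actual, predicted) label pairs, as in the Python sketch below. The labels are invented, and the orientation (rows for actual classes, columns for predicted classes) is an assumption that may differ from how the platform draws it.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions per (actual, predicted) pair; rows are actual classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for actual, predicted in zip(true_labels, predicted_labels):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


classes = ["shirt", "bag", "sneaker"]
true_labels = ["shirt", "shirt", "bag", "sneaker", "sneaker", "bag"]
predicted_labels = ["shirt", "bag", "bag", "sneaker", "shirt", "bag"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
for name, row in zip(classes, matrix):
    print(name, row)
# shirt [1, 1, 0]
# bag [0, 2, 0]
# sneaker [1, 0, 1]

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(classes)))
print(correct, "of", len(true_labels))  # 4 of 6
```

Off-diagonal cells show which classes get mistaken for which, for example one sneaker predicted as a shirt here.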
OK, let’s see if you can improve the model and get a higher accuracy. You saw that after epoch 5 the training and validation graphs diverge, and the model starts to overfit quite a lot. The first thing to try is some regularization, such as increasing the drop rate of the Dropout blocks. By default it’s set to 0.1, meaning 10% of the input units are dropped, but you can experiment with higher values. If the training loss and validation loss become more similar, the experiment is overfitting less.
Other ways to improve the model are to add blocks and change settings in the model. As long as you keep the same loss function, you can compare the results of the experiments in the Evaluation view and see which one is best.
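To illustrate what a drop rate of 0.1 means, here is a sketch of "inverted dropout", a common formulation in which each input unit is zeroed with probability equal to the rate and the surviving units are scaled up to compensate. Whether the Dropout block on the platform is implemented exactly this way is an assumption. The function name and inputs are illustrative.

```python
import random


def dropout(inputs, rate, rng):
    """Zero each input with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in the range [0, 1)")
    keep_probability = 1.0 - rate
    return [x / keep_probability if rng.random() >= rate else 0.0
            for x in inputs]


rng = random.Random(0)      # seeded for reproducibility
inputs = [1.0] * 10_000     # hypothetical layer inputs
outputs = dropout(inputs, rate=0.1, rng=rng)

dropped_fraction = outputs.count(0.0) / len(outputs)
print(dropped_fraction)     # close to 0.1
```

Dropout is only applied during training. Because different units are dropped on every pass, the model cannot rely on any single unit, which tends to reduce memorization.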
Examples of what you can change:
This tutorial has shown you how quickly and easily you can create and test experiments on the Peltarion Platform. You've built a basic CNN model to solve a classification problem and acquired some basic deep learning knowledge.
In this tutorial, only one label can be correct, but what if the object could have many labels, e.g., a shirt labeled both “red” and “short sleeve”? How do you solve such a problem? Check out our tutorial Predicting mood from raw audio data and learn how to solve a multi-label classification problem.