Detecting defects in mass produced parts

Use image classification to solve real business problems

This tutorial will show you how to develop an AI model that uses images to detect production faults quickly and efficiently.

Once it has been trained, this model could be implemented straight into a production line to automatically indicate which parts to sell and which to scrap.

Target audience: Beginners

Estimated time:
  • Setup - 10 minutes

  • Training - 30 minutes

The problem

Identifying faulty parts in a production line is an important task in the manufacturing industry. Deep learning-based approaches to defect detection can be cheap, quick, and reliable.

This tutorial will show you how to develop a model that can detect defective products using images.

You will learn to

  • Import data onto the platform

  • Solve a binary image classification problem

  • Analyze the performance of your model

  • Test the deployment of your model

The data - Images of mass produced parts

In this example, we will be detecting defects in submersible pump impellers. These are mass-produced metal components and the defects featured in this dataset are typical of many manufacturing processes.

This dataset features 1330 grayscale images of shape 256 x 256 pixels. An example of the images is shown below. The images have been labeled as either defective or non-defective.

Defect detection parts

Create a project

Let’s begin!

First, click New project and name it so you know what kind of project it is.

Import the photos

After creating the project, you will be taken to the Datasets view, where you can import data. Click the Data library button and look for the Defects in metal casting dataset in the list.

If you agree with the license, click Accept and import. This will import the dataset to your project, and you will see the dataset’s details where you can edit features and subsets.

Preprocess the data

Use the default data subsets

By default, your data is split into two subsets:

  • Training (80%)
    This is the data that the model uses during training to learn how to classify the images.

  • Validation (20%)
    The validation set is used to evaluate the model, after each epoch, on data that it has not yet seen. This is used to identify the version of the model with the best performance.
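The platform creates these subsets for you, but the idea behind an 80/20 split can be sketched in a few lines of Python. This is only an illustration: the file names, the seed, and the `split_train_validation` helper are hypothetical, and the platform's own splitting logic may differ.

```python
import random

def split_train_validation(items, train_fraction=0.8, seed=0):
    """Shuffle the items and split them into training and validation lists."""
    shuffled = list(items)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the 1330 images in the dataset.
image_names = [f"part_{i:04d}.png" for i in range(1330)]
train, validation = split_train_validation(image_names)
print(len(train), len(validation))
```

Shuffling before cutting matters: if the files were sorted by label, an unshuffled split could leave one class missing from the validation subset.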

Set binary encoding

This problem is an example of binary image classification. This means that the model assigns each photo one of only two possible classes, true or false, depending on whether or not the part is defective. Therefore, we will use Binary encoding.

Click the spanner icon to check if the default settings of the Defective feature are appropriate:

  • The Encoding should be Binary

  • The Positive class should be true

Binary encoding

Binary encoding will allow more precise evaluation options at the end of the experiment.
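Conceptually, binary encoding maps the positive class to 1 and everything else to 0. The sketch below shows that mapping on a few made-up labels. The `encode_binary` helper is hypothetical and only mirrors the Encoding and Positive class settings described above; it is not the platform's implementation.

```python
def encode_binary(labels, positive_class="true"):
    """Map each label to 1 if it equals the positive class, else 0."""
    return [1 if str(label).lower() == positive_class else 0 for label in labels]

# Made-up labels for four parts: "true" means the part is defective.
labels = ["true", "false", "false", "true"]
encoded = encode_binary(labels)
print(encoded)
```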

We are now ready to begin building the model. Click Save version, then click Use in new experiment.

Build the model

You will find the Experiment Wizard, which takes you through the following steps:

  1. Define dataset

    • Double-check that you are using the latest dataset version and that you have the appropriate subsets selected as training and validation subsets.

  2. Choose snippet

    • Make sure that your input feature is the images and that your output feature is the label.

    • The snippet we will be using is called EfficientNet B0.

  3. Initialize weights

    • The EfficientNet snippet can be initialized with pretrained weights, which lets you transfer knowledge learned from a different dataset. This helps you get better results with less training time and data.

This completes the Experiment Wizard. Click Create to continue to the Modeling View.

Set image augmentation

To set Image augmentation, click the Input block in the modeling view (the first block in the modeling canvas) and select Natural images.

Image augmentation is a way of slightly manipulating your dataset to add more variation to the training set. This is not strictly necessary but will likely improve your validation accuracy.

image augmentation
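To make the idea concrete, here is a minimal sketch of one common augmentation, a random horizontal flip, applied to a tiny image stored as a list of pixel rows. It is an assumption that flips are representative: the exact transformations behind the Natural images setting are not described in this tutorial, and `random_horizontal_flip` is a hypothetical helper.

```python
import random

def random_horizontal_flip(image, rng, probability=0.5):
    """Return a copy of the image, mirrored left-to-right with the given probability."""
    if rng.random() < probability:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]

# A made-up 2 x 3 grayscale image (pixel values 0-255).
image = [
    [0, 50, 100],
    [150, 200, 250],
]
rng = random.Random(0)
augmented = random_horizontal_flip(image, rng)
print(augmented)
```

Because the flip is random, the model sees a slightly different version of the same part in different epochs, which is where the extra variation comes from.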

Run the experiment

It is now time to run the model. The default settings work well for this experiment, but feel free to experiment with the optimization settings to see how they affect performance.

Once you click Run, training will take a while, so take a break and check back later to see how well the model is doing.

Evaluate the experiment

In the Evaluation view, you will find several ways of looking at how your model is performing.

Binary accuracy and recall

In the Evaluation view, there are various loss and classification metrics.

For this particular situation, we are interested in binary accuracy and recall:

  1. Binary accuracy measures the proportion of total predictions that are correct.

  2. Recall measures the proportion of actual positives (defective parts) that were marked as defective.

The reason for looking at both of these metrics is that while it is important to have a high accuracy overall, in this case, false negatives are more harmful than false positives: a false negative means accidentally shipping a defective part, while a false positive only means accidentally discarding a non-defective one.
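The two metrics can be computed by hand, which makes the difference between them easier to see. The sketch below uses made-up labels and predictions (1 means defective), not results from this experiment.

```python
def binary_accuracy(y_true, y_pred):
    """Proportion of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall(y_true, y_pred):
    """Proportion of actual positives that were predicted as positive."""
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_positives = sum(1 for t in y_true if t == 1)
    return true_positives / actual_positives if actual_positives else 0.0

# Made-up example: 3 defective parts out of 10, and the model catches only one.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(binary_accuracy(y_true, y_pred))  # 8 of 10 predictions are correct
print(recall(y_true, y_pred))           # 1 of 3 defective parts is caught
```

In this example the accuracy looks reasonable at 80%, yet two of the three defective parts would be shipped, which is why recall is worth watching alongside accuracy.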

Confusion matrix and ROC curve

To inspect the predictions of your model, select the subset and checkpoint (epoch) you wish to inspect and click Inspect.

The confusion matrix is a good way of understanding exactly how your model has performed. It shows you how many predictions fall into each possibility (true positive, true negative, false positive, and false negative).

The Threshold slider sets how confident the model must be before it makes a positive prediction. In this case, because false negatives are much more harmful than false positives, we use a low threshold value, so that more parts are flagged as defective.

confusion matrix
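The effect of the threshold on the confusion matrix can be reproduced with a short sketch. The scores below are made up, and treating a score equal to the threshold as positive is an assumption; the platform may handle that edge case differently.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            counts["tp"] += 1
        elif predicted == 1 and label == 0:
            counts["fp"] += 1
        elif predicted == 0 and label == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

# Made-up labels (1 = defective) and model scores for six parts.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1]

default_counts = confusion_counts(y_true, scores, threshold=0.5)
low_counts = confusion_counts(y_true, scores, threshold=0.25)
print(default_counts)
print(low_counts)
```

Lowering the threshold from 0.5 to 0.25 removes the one false negative in this example at the cost of one false positive, which is the trade-off the slider lets you explore.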

Another tool, specific to binary classification, is the ROC curve. It plots the true positive rate against the false positive rate across all threshold values, so you can see how well the model separates defective from non-defective parts.

ROC curve
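A ROC curve is built by sweeping the threshold and recording the false positive rate and true positive rate at each step. The sketch below does this for a handful of made-up scores. It assumes both classes are present in the data, and `roc_points` is a hypothetical helper rather than the platform's code.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Start at (0, 0): a threshold above every score predicts no positives.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points

# Made-up labels (1 = defective) and model scores for six parts.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The closer the curve passes to the top-left corner (high true positive rate at a low false positive rate), the better the model separates the two classes.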

Deploy your model

The last step in this process is to deploy your model and see how it performs in the real world. To do this, download and unzip the test data. You will find 5 defective and 5 non-defective images that were removed from the other dataset. This means that they are completely unseen by the model so far.

  1. Go to the Deployment view, and click New deployment.
    The Create deployment dialog will appear.

  2. Select your desired Experiment and your best epoch as your Checkpoint.

  3. Click the Enable button to deploy the experiment.

Test how it works using our web app

To test your model, you can either use the Test Deployment button or use the web app below, which we made specifically for this case.

  1. Copy the URL and Token and paste them into the corresponding fields in the app.

  2. Click the image icon to upload one of the test images. Once you have tried it, you can upload another image by clicking the image again.

This web app was built using the Peltarion plugin on a no-code web development service called Bubble. To learn how to build an app like this yourself, follow the Create a no-code AI app tutorial.
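The same URL and Token could in principle be used from a script instead of the web app. The sketch below only builds a request and does not send it. Everything about the request format is an assumption made for illustration: the bearer-token header, the JSON payload with a `rows` list, the `image` feature name, and the example URL are all hypothetical, so check your deployment's own API details before relying on any of them.

```python
import base64
import json
import urllib.request

def build_prediction_request(url, token, image_bytes, feature_name="image"):
    """Build (but do not send) a POST request carrying one base64-encoded image.

    The payload layout and the header format are assumptions, not a documented API.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = {"rows": [{feature_name: encoded}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder values: replace them with the URL and Token from the Deployment view.
request = build_prediction_request(
    "https://example.com/deployment",
    "YOUR_TOKEN",
    b"placeholder image bytes",
)
print(request.get_method(), request.full_url)
```

Sending the request, for example with `urllib.request.urlopen(request)`, is left out on purpose so that the sketch runs without network access.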


Congratulations, you have completed the defect detection tutorial! In this experiment, you have learned how to solve a binary image classification problem that can be implemented in real value-adding applications.

It is easy to imagine how this example could be integrated into a production line to automatically discard any defective parts. To learn more about defect detection, read the in-depth blog post.
