Image similarity / cheat sheet
Target audience: Data scientists and developers
Use this cheat sheet if you build projects where you want to find similar images.
What is image similarity?
Image similarity is a way to quantify how similar two images are.
You do this by converting all images in your dataset to compressed vectors (embeddings) with a deep learning model.
These vectors are added to an index.
When you want to find images in your dataset that are similar to a new image, you run the new image through the same model. Then you compare the new image’s vector to the dataset vectors in the index to find the closest matches.
Image similarity search is fast because the compact vectors are far smaller than the original images.
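The vector comparison can be sketched with cosine similarity, assuming the index is a NumPy array with one embedding per row (the random data, the 1000-image index size, and the 1280-dimensional embeddings are illustrative assumptions):

```python
import numpy as np

# Hypothetical index: one embedding per dataset image. The 1280
# dimensions match EfficientNet B0's embedding size; the random
# values stand in for real model output.
rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 1280))   # 1000 dataset images
query = rng.normal(size=1280)           # embedding of the new image

# Normalize so a plain dot product equals cosine similarity.
index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
query_norm = query / np.linalg.norm(query)

scores = index_norm @ query_norm        # similarity to every dataset image
top5 = np.argsort(scores)[::-1][:5]     # indices of the 5 most similar images
print(top5)
```

This brute-force scan is linear in the dataset size; production indexes typically use approximate nearest-neighbor structures, but the principle is the same.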
Example use cases
Search and find similar looking items.
Search and find items that are difficult to filter using keywords.
Add your image dataset
Nothing special needs to be done for this problem type when you add your image dataset to the platform.
In the Datasets view, select one of the import options: from your data warehouse, directly from your local computer, via the data API, or from a URL.
Choose a snippet
Use the Experiment wizard to choose your snippet. Snippets are pre-built neural network architectures available on the platform.
In the Snippet tab, select Image similarity as the Problem type, then select the EfficientNet B0 Embedding snippet.
The EfficientNet B0 Embedding snippet now appears on the Modeling canvas.
Train for 1 epoch
Navigate to the Settings tab and set the number of epochs to 1. One epoch is all you need.
Click Run to start the training.
You can skip the Evaluation view, since you only train for one epoch.
Create new deployment
In the Deployment view, click New deployment and select Similarity search.
Select your experiment, epoch 1 as the checkpoint to deploy, and Image embedding as the Output feature.
Finally, click Create.
Now all the images in the dataset pass through the model once, and the platform builds the index.
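Conceptually, the index built here is just the stacked embeddings paired with image identifiers. A minimal sketch, assuming embeddings arrive as a mapping from image id to vector (all names and the toy 2-dimensional vectors are illustrative):

```python
import numpy as np

def build_index(embeddings: dict):
    # Stack per-image embeddings into one matrix, keeping the image
    # ids in a parallel list so search results can be mapped back.
    ids = list(embeddings)
    matrix = np.stack([embeddings[i] for i in ids]).astype(float)
    # Pre-normalize rows so a dot product gives cosine similarity.
    matrix /= np.linalg.norm(matrix, axis=1, keepdims=True)
    return ids, matrix

# Toy example with 2-dimensional "embeddings".
ids, matrix = build_index({
    "cat.jpg": np.array([1.0, 0.0]),
    "dog.jpg": np.array([0.0, 1.0]),
})
```

Pre-normalizing at index-build time means each query only needs one matrix-vector product at search time.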
Click Enable to make the deployment available for REST API calls.
Test the deployment
You can test that it works with our Image similarity - API tester.
API call workflow
This is what happens when you send an image to your deployed model.
The new image passes through the deployed model and is converted into a vector.
The new image’s vector is compared with all vectors in the index.
The most similar images are returned.
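A call to the deployment can be sketched as a plain HTTP POST. The URL, the Bearer-token header, and the raw-image-bytes body below are placeholders for illustration, not the platform’s documented API — check your deployment’s API page for the real endpoint and request format:

```python
import urllib.request

# Placeholders — replace with your deployment's actual URL and token.
DEPLOY_URL = "https://example.com/similarity/v1/query"
TOKEN = "your-deployment-token"

def build_request(image_bytes: bytes) -> urllib.request.Request:
    # Build (but do not send) the POST request. Sending the image as
    # raw JPEG bytes with Bearer auth is an assumption for this sketch.
    return urllib.request.Request(
        DEPLOY_URL,
        data=image_bytes,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "image/jpeg",
        },
        method="POST",
    )

# To send:
#   with open("query.jpg", "rb") as f:
#       response = urllib.request.urlopen(build_request(f.read()))
```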