Buy or not / Predict from tabular data

Predict whether a customer will buy, based on earlier customers’ buying patterns

Money!! Understanding what makes a customer willing to pay up and buy a product has always been key to businesses.

This tutorial will show you how you can build simple AI models using the spreadsheets that so many of us work with. You will use tabular data to solve a classification problem, and get advice on how you’d also solve a regression problem.

Target audience: Beginners
Estimated time: Setup - 5 minutes | Training - 10 minutes

You will learn to:

  • Import and use tabular data on the Peltarion Platform.

  • Solve a binary classification problem: will a customer buy, yes or no?

  • Analyze the performance of your model.

The problem - Unleash the power of the spreadsheet

Most of the data that businesses collect is tabular, that is, data that can be stored in a spreadsheet: numerical, categorical, binary, or any combination of those. You name it.

How do you use this data to make really good predictions? Well, there are many ways to make predictions using tabular data, and the Peltarion Platform is a great way to quickly and intuitively leverage your data to make valuable predictions.

Getting started - create a project

Let’s begin! Navigate to the Projects view, click New project, and give the project a name that tells you what kind of project it is.

Import dataset from Data library

In the Datasets view, click Import free datasets and choose the Bank marketing dataset. This dataset is used to solve a binary classification problem for a propensity-to-buy use case.

Figure 1. Bank marketing in the data library.

After you have reviewed the information about the dataset, click on Accept and import to accept the terms of the dataset’s license and import it into your project.

Import your own tabular data

If you want to train a model to make predictions tailored to your own use case, you can upload your own tabular data as a comma-separated values (CSV) file. CSV files come in many flavors, so make sure that yours follows our requirements.

Most spreadsheet software, such as Microsoft Excel or Google Sheets, has a built-in function to export your spreadsheet as a CSV file.
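Before uploading an exported file, it can help to check that it parses cleanly. The following is a minimal sketch using only Python's standard csv module. The function name check_csv and the sample column names are hypothetical, and the single check shown (every row has as many fields as the header) is an assumption about what a well-formed file looks like, not the platform's actual list of requirements.

```python
import csv
import io


def check_csv(text):
    """Parse CSV text and return (header, rows).

    Raises ValueError if any row has a different number of fields
    than the header row. This is an illustrative sanity check only.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for row in reader:
        if len(row) != len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, "
                f"got {len(row)}"
            )
        rows.append(row)
    return header, rows


# Hypothetical sample; these column names are illustrative only.
sample = "age,job,purchased\n41,technician,0\n35,admin,1\n"
header, rows = check_csv(sample)
print(header)
print(len(rows), "data rows")
```

To check a real export, read the file's contents into a string and pass it to check_csv in the same way.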

The data

To teach a model to predict results, you need examples of input features coupled with the target historical result.

Input features

The Bank marketing dataset uses data from a phone marketing campaign. It contains many features, such as the age, employment, and education of the client, the response to earlier phone campaigns, and the day of the week of the phone call.

Target feature

We will use the feature purchased as the target feature, that is, the outcome that the AI model will learn to predict from the input features. The purchased feature records whether or not the client subscribed to a term deposit after the phone call.

The Encoding is Binary. That means it has only two possible values:

  • 1 for buying, set to Positive class.

  • 0 for not buying.
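If your own spreadsheet stores the outcome as text, you would need a 1/0 column like the one described above. This is a minimal sketch of that mapping. The "yes"/"no" labels and the function name encode_binary are hypothetical and only illustrate the idea; the Bank marketing dataset already provides purchased as 1 and 0.

```python
def encode_binary(values, positive_label):
    """Map each value to 1 if it equals the positive label, else 0."""
    return [1 if value == positive_label else 0 for value in values]


# Hypothetical raw outcomes, as they might appear in a spreadsheet column.
raw = ["yes", "no", "no", "yes", "no"]

encoded = encode_binary(raw, positive_label="yes")
positive_rate = sum(encoded) / len(encoded)

print(encoded)
print("share of positive class:", positive_rate)
```

The share of the positive class is worth a look before training: when buyers are rare, accuracy alone can look good even for a model that never predicts a purchase.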

Click Use in new experiment, and the Experiment wizard will pop up.

Build your model in the Experiment wizard

The Experiment wizard makes it really easy for you to set up an experiment. Let’s take a look and make sure that all presets are correct:

  • Dataset tab
    The Bank marketing dataset is selected.

  • Inputs / target tab

    • In the Inputs column, select everything except the purchased feature.

    • In the Target column, select purchased as the target feature.
      The target is what the model will learn to predict.

  • Problem type tab
    Given the inputs and target selected, the wizard automatically recommends Tabular classification as the Problem type.
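The Inputs / target choice above amounts to splitting every row into input features and one target value. The sketch below illustrates this with Python's standard csv module. Only the name purchased comes from this tutorial; the other column names, the sample values, and the function split_inputs_and_target are hypothetical, and this is not how the platform itself is implemented.

```python
import csv
import io


def split_inputs_and_target(rows, target_name):
    """Split each row dict into (inputs without the target, target value)."""
    inputs = [
        {name: value for name, value in row.items() if name != target_name}
        for row in rows
    ]
    targets = [row[target_name] for row in rows]
    return inputs, targets


# Hypothetical sample rows; only "purchased" is a name used in the tutorial.
sample = (
    "age,education,day_of_week,purchased\n"
    "41,university,mon,0\n"
    "35,high_school,thu,1\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
inputs, targets = split_inputs_and_target(rows, "purchased")

print(inputs[0])
print(targets)
```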

Click Create.

Modeling canvas

The wizard has created a model that fits your tabular data. Every input feature has its own Input block. You can inspect the model if you want.

All settings are pre-populated, and it’s time to train the model. Click Run.

Evaluation view

In the Evaluation view, you will find several ways of analyzing how your model is performing. The specific metrics shown in the Evaluation view depend on your problem type and loss function.

Loss and metrics curves

The Loss and metrics curves show the performance of your model on the training and validation datasets for different epochs. In general, you are aiming to:

  • minimize loss and error metrics

  • maximize accuracy.

To identify which metrics are most important for your specific application, read more about loss and metrics.
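To make "minimize loss, maximize accuracy" concrete, here is a minimal sketch that computes both for two sets of predictions. It assumes binary cross-entropy as the loss, which is a common choice for binary classification but is not confirmed here as the platform's setting. The predicted probabilities are made-up values standing in for an early and a later epoch.

```python
import math


def binary_crossentropy(targets, probabilities, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(targets, probabilities):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


def accuracy(targets, probabilities, threshold=0.5):
    """Share of examples whose thresholded prediction matches the target."""
    correct = sum(
        1 for y, p in zip(targets, probabilities)
        if (p >= threshold) == (y == 1)
    )
    return correct / len(targets)


targets = [1, 0, 0, 1]
# Made-up predicted probabilities of the positive class.
early_epoch = [0.6, 0.5, 0.4, 0.4]
later_epoch = [0.9, 0.2, 0.1, 0.7]

for name, probs in [("early", early_epoch), ("later", later_epoch)]:
    loss = binary_crossentropy(targets, probs)
    acc = accuracy(targets, probs)
    print(f"{name}: loss={loss:.3f} accuracy={acc:.2f}")
```

In this toy example the later predictions are both more confident and more often correct, so the loss goes down while the accuracy goes up.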

Predictions inspection tab

The Predictions inspection tab lets you analyze the performance of a particular epoch on a particular subset.

ROC curve

The Bank marketing use case is a binary classification problem, so you get the opportunity to set a threshold. The threshold value lets you control how the errors made by the model are distributed between false positives and false negatives.
Slide the Threshold slider to a value that suits your use case, for example, 0.2.
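The effect of the threshold can be sketched in a few lines of Python. The targets and predicted probabilities below are made up, and the rule "predict positive when the probability is at or above the threshold" is an assumption for illustration; the platform may treat the boundary case differently.

```python
def confusion_counts(targets, probabilities, threshold):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(targets, probabilities):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Made-up example: three buyers (1) and five non-buyers (0).
targets = [1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.4, 0.25, 0.6, 0.3, 0.15, 0.1, 0.05]

for threshold in (0.5, 0.2):
    print(threshold, confusion_counts(targets, probabilities, threshold))
```

Lowering the threshold from 0.5 to 0.2 in this example removes the false negatives (missed buyers) at the cost of one more false positive. Which trade-off is better depends on what each kind of error costs your business.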

The features of this section are also dependent on your problem type. Read this article on Prediction inspection to learn more.

Figure 2. ROC curve
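A ROC curve plots the true positive rate against the false positive rate for every possible threshold. The sketch below computes those points, and the area under the curve with the trapezoidal rule, for made-up predictions. It is a plain-Python illustration of the general technique, not the platform's implementation, and it assumes that the data contains at least one positive and one negative example.

```python
def roc_points(targets, probabilities):
    """Return (false positive rate, true positive rate) points.

    One point is produced per distinct predicted probability, used as a
    threshold, starting from (0.0, 0.0). Requires at least one positive
    and one negative target.
    """
    positives = sum(targets)
    negatives = len(targets) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(probabilities), reverse=True):
        tp = sum(1 for y, p in zip(targets, probabilities)
                 if p >= threshold and y == 1)
        fp = sum(1 for y, p in zip(targets, probabilities)
                 if p >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as sorted (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: three buyers (1) and five non-buyers (0).
targets = [1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.4, 0.25, 0.6, 0.3, 0.15, 0.1, 0.05]

points = roc_points(targets, probabilities)
auc = area_under_curve(points)
print(points)
print("AUC:", round(auc, 3))
```

An area of 1.0 would mean that every buyer is ranked above every non-buyer, while 0.5 corresponds to random ranking.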

Tutorial recap

Congratulations, you have completed this tabular data tutorial! In this tutorial, you have imported and used tabular data on the Peltarion Platform.

You’ve used this data to solve a binary classification problem, that is, a problem with only two categories. You’ve also quickly analyzed the performance of your model.

Next steps

Improve your model

A vital step in successful data science is not just building a working prototype but also going back and experimenting with new iterations of your model to improve the performance.

One easy way is to click Tune experiment, and then create and run the suggested models.

You can also read more about improving your experiment in these articles.

Next tutorial

The Improving your tabular data model tutorial gives you guidance on which settings and parameters to change when you try to improve your model.


Further reading

With good input data, models like these can be used to make important predictions and solve a wide array of interesting problems. Read more here:
