Predict real estate prices

Solve a regression problem using table data and images

If you deploy the final trained AI model from this tutorial in real life, someone could load the location, size of their house, etc., via an online portal and get a valuation. Nice!

Getting a good estimate of the price of a house is hard even for the most seasoned real estate agents. With deep learning, it is possible to get a more sophisticated valuation by combining several data types, such as images and tabular data.

Target audience: Beginners

You will learn to
  • Solve a regression problem, i.e., a problem where you predict a quantity such as a price.

  • Use multiple datasets, both tabular data and images.

  • Run multiple experiments and compare them.

Preread:
Before following this tutorial, it is strongly recommended that you complete the Deploy an operational AI model tutorial if you have not done so already.


Create a project

Start by creating a project in the Projects view by clicking New project.


Add dataset to the project

In the Datasets view, click the Import free datasets button.


Look for the Cali House - tutorial data dataset in the list. This dataset consists of map images of the blocks from OpenStreetMap and tabular demographic data collected in the 1990 California census.

If you agree with the license, click Accept and import.

This will import the dataset into your project, and you can now edit it.


Each sample in the dataset gives the following information about one block of houses:
Median house age, Total number of rooms, Total number of bedrooms, Population, Number of households, Median income, and Median house value.

In this tutorial, we wish to make an AI model that learns to predict the price of a house, here called medianHouseValue, given the other available data (i.e., median house age, population, etc.).

Hence, medianHouseValue is our target feature, while the others are our input features.

Normalize image feature

You normalize a dataset to make it easier and faster to train a model.

Locate the image_path feature and click the wrench icon.
Change the Normalization from None to Standardization.

Standardization converts a set of raw input data to have zero mean and unit standard deviation. Values above the feature’s mean will get positive scores, and those below the mean will get negative scores.
The reason we normalize or scale input data is that neural networks tend to train better when the input values are centered around zero and on a similar scale, roughly in an interval of -1 to 1.
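
To make the standardization step concrete, here is a minimal sketch in plain Python. The function name and the sample values are made up for illustration, and how the platform computes the mean and standard deviation internally (for example, over which subset of the data) is not covered here.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical; only centering is possible.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up raw values, e.g. pixel intensities (not taken from the dataset).
raw = [0.0, 64.0, 128.0, 255.0]
scaled = standardize(raw)

# Values below the mean become negative, values above it become positive.
print(scaled)
```

After scaling, the values have a mean of zero and a standard deviation of one, so most of them end up within a few units of zero.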

Create a tabular feature set

A feature set is two or more features that you want to treat in the same way during modeling.

This feature set consists of the tabular data on the houses, for example, number of bedrooms and median income.

Click on New feature set, name the feature set 6_features and select the information on the houses:

  • housingMedianAge (1)

  • totalRooms (1)

  • totalBedrooms (1)

  • population (1)

  • households (1)

  • medianIncome (1)

Click Create.
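
Conceptually, the feature set groups the six tabular columns into one input, while medianHouseValue stays separate as the target. The sketch below shows that split for a single sample in plain Python. The feature names come from the list above; the sample values, the image path, and the helper function are invented for illustration and do not reflect how the platform stores data.

```python
# Feature names as listed in the tutorial.
FEATURE_SET = [
    "housingMedianAge",
    "totalRooms",
    "totalBedrooms",
    "population",
    "households",
    "medianIncome",
]
TARGET = "medianHouseValue"

# One made-up sample (values are illustrative, not from the dataset).
sample = {
    "housingMedianAge": 30.0,
    "totalRooms": 1500.0,
    "totalBedrooms": 300.0,
    "population": 900.0,
    "households": 280.0,
    "medianIncome": 4.2,
    "medianHouseValue": 210000.0,
    "image_path": "images/block_0001.png",  # hypothetical path
}


def split_sample(row):
    """Return (inputs, target): the six tabular inputs and the value to predict."""
    inputs = [row[name] for name in FEATURE_SET]
    return inputs, row[TARGET]


inputs, target = split_sample(sample)
print(len(inputs), target)
```

Note that image_path is not part of this feature set; it is used as a separate input later in the tutorial.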


Create experiment for tabular data

Now that we have the data, let’s create the AI model. We’ll start by trying to predict the prices from the tabular data alone.

Experiment

On the Peltarion Platform, an experiment is the basic unit you’ll be working with. It’s the basic hypothesis that you want to try, i.e., “I think I might get good accuracy if I train this model, on this data, in this way.”

An experiment contains all the information needed to reproduce the experiment:

  • The dataset

  • The AI model

  • The settings or parameters used to run the experiment.

The result is a trained AI model that can be evaluated and deployed.

Experiment wizard

Click Create an experiment to open the Experiment wizard.

  • Give the experiment a good name, for example, Tabular data experiment.

  • Dataset tab
    Make sure that the Cali House dataset is selected.

  • Inputs / target tab, make sure:

    • The Input feature is 6_features (the feature set you created)

    • The Target feature is medianHouseValue.

  • Problem type tab
    Select Tabular regression in the drop-down menu. Regression is when a trained model predicts a numeric value, such as a price.

  • Click Create. This will add a complete deep learning model to the Modeling canvas.


Run experiment from modeling canvas

The Experiment wizard has pre-populated all settings needed:

  • The Loss in the Target block is set to Mean Squared Error (MSE). MSE is often used for regression, and it is a natural choice when the target, conditioned on the input, is normally distributed.

  • The last Dense block has Units set to 1 because we want only one prediction.
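
As a rough illustration of what the MSE loss measures, the sketch below computes it by hand for three made-up house values. The function and the numbers are illustrative only; the platform computes the loss for you during training.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared_errors) / len(squared_errors)


# Made-up median house values and model predictions.
targets = [200000.0, 350000.0, 120000.0]
predictions = [210000.0, 330000.0, 125000.0]

mse = mean_squared_error(targets, predictions)
print(mse)  # 175000000.0
```

Because the errors are squared, one large miss (here 20,000) contributes far more to the loss than several small ones.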

Time to train the model and see if we’ve come up with a good model.

Done! Click Run.


Create experiment with two inputs - tabular and image

Watch the experiment train in the Evaluation view. While the first experiment runs, you can build and run a concurrent experiment to find out if you can improve on the result.

  1. Navigate to the Modeling view.

  2. Click the Iterate button.

  3. Open the Reuse part of model tab.
    Select the best checkpoint and keep Target as Terminal port.

Note
If you’re on the Free plan you can run 1 experiment at a time. All other plans can run concurrent experiments.

Build a model with multiple inputs

The already trained model will now show up in the Modeling view as a separate block. We call it a user block. The block gets its name from the previous experiment’s name, for example, Tabular data experiment.

Now build a model based on the user block according to this illustration:

Complete model
  1. Connect an Input block to the user block.
    Set Feature to the feature set you created, 6_features.

  2. Add a new Input block.
    Set Feature to the image feature, image_path.

  3. Add an EfficientNet B0 block and connect it to the image input.

  4. Expand the Transform heading and add a Concatenate block.

  5. Connect the output from the user block and the EfficientNet B0 block to the input of the Concatenate block.

  6. Add a Dense block.
    Set Units to 512.
    Set Activation to ReLU.

  7. Add a Dense block.
    Set Units to 1.
    Set Activation to Linear.

  8. Add a Target block.
    Set Feature to medianHouseValue.
    Set Loss function to Mean squared error.

  9. Done!

Click Run, and move on to compare the two experiments.
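
To show what the Concatenate and Dense blocks in steps 4 to 7 do to the data, here is a minimal forward-pass sketch in plain Python with random, untrained weights. It is not the platform's implementation. The output widths of the user block (1) and of the image branch (1280) are assumptions made for this example, and all helper names are hypothetical.

```python
import random

# Assumed sizes, for illustration only: the tutorial does not state the
# output width of the user block or of the EfficientNet B0 block.
TABULAR_WIDTH = 1
IMAGE_WIDTH = 1280
HIDDEN_UNITS = 512  # from step 6


def relu(z):
    return max(0.0, z)


def linear(z):
    return z


def make_dense(rng, n_inputs, n_units):
    """Create random (untrained) weights and zero biases for a dense layer."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_inputs)]
               for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


def dense(inputs, layer, activation):
    """Compute activation(w . x + b) for every unit in the layer."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


rng = random.Random(0)

# Stand-ins for the outputs of the two branches for one sample.
tabular_branch = [rng.random() for _ in range(TABULAR_WIDTH)]
image_branch = [rng.random() for _ in range(IMAGE_WIDTH)]

# Concatenate: join both outputs into one long vector.
merged = tabular_branch + image_branch

# Dense (512 units, ReLU) followed by Dense (1 unit, Linear).
hidden = dense(merged, make_dense(rng, len(merged), HIDDEN_UNITS), relu)
prediction = dense(hidden, make_dense(rng, HIDDEN_UNITS, 1), linear)

print(len(merged), len(hidden), prediction)
```

The final layer has a single unit with linear activation, so the model outputs one unbounded number per sample, which is what a price prediction needs.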


Analyze experiments

The Evaluation view shows in several ways how the training of the model has progressed and how your experiments are performing.

As long as you keep the same loss function, you can compare the results of the experiments and see which one is the best.

Did the second input help?

Loss graph

The lower the loss, the better the model (unless the model has overfitted to the training data). The loss is calculated on both the training set and the validation set, and it tells you how well the model is doing on each of them. It is an aggregate of the errors made on each example in the training or validation set.

Figure 1. Loss graph
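
One way to read a loss graph is to look for the epoch where the validation loss is lowest and to check whether it rises afterwards while the training loss keeps falling, which is a typical sign of overfitting. The sketch below does this for made-up loss curves. It assumes that the best epoch is the one with the lowest validation loss; the platform's own definition of the best checkpoint may differ.

```python
# Made-up loss values per epoch (not results from this tutorial).
training_loss = [0.90, 0.60, 0.45, 0.38, 0.33, 0.30]
validation_loss = [0.95, 0.70, 0.55, 0.52, 0.54, 0.58]

# Epoch (0-based) with the lowest validation loss.
best_epoch = min(range(len(validation_loss)), key=validation_loss.__getitem__)

# Sign of overfitting: after the best epoch, validation loss has gone up
# while training loss has continued to go down.
overfitting = (
    validation_loss[-1] > validation_loss[best_epoch]
    and training_loss[-1] < training_loss[best_epoch]
)

print(best_epoch, overfitting)  # 3 True
```

When you compare two experiments, compare their validation losses rather than their training losses, since the validation set was not used to fit the weights.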

Prediction scatter plot

Navigate to the Predictions inspection tab and take a look at the scatter plot.

With perfect predictions, every point in the scatter plot would lie on the diagonal going from bottom left to top right.

Figure 2. Prediction scatterplot
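
The distance of a point from the diagonal is simply the difference between the predicted and the actual value. As a small illustration with made-up numbers, the sketch below computes these differences and their mean absolute value. The variable names are chosen for this example and are not platform terms.

```python
# Made-up actual and predicted median house values.
actual = [100000.0, 200000.0, 300000.0, 400000.0]
predicted = [110000.0, 190000.0, 300000.0, 370000.0]

# Residual: how far each point sits above (+) or below (-) the diagonal.
residuals = [p - a for a, p in zip(actual, predicted)]

# Mean absolute error: the average distance from the diagonal.
mae = sum(abs(r) for r in residuals) / len(residuals)

# Points that lie exactly on the diagonal.
on_diagonal = sum(1 for r in residuals if r == 0)

print(residuals, mae, on_diagonal)
```

In this example only one of the four points is exactly on the diagonal, and the predictions are off by 12,500 on average.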

Tutorial recap

Congratulations, you’ve completed the California house pricing tutorial. In this tutorial, you’ve learned how to:

  • Solve a regression problem, first by using one input and then by extending the experiment using multiple datasets.

  • Analyze the experiments to find out which one was the best.

Good job!

Next tutorial - Sales forecasting with spreadsheet integration

We suggest Sales forecasting with spreadsheet integration as your next tutorial.

You will learn to:

  • Build a deep learning model with no code.

  • Predict sales numbers from spreadsheet data.

  • Deploy your model for production on the Peltarion Platform.

  • Integrate your model with Google Sheets or Microsoft Excel.
