Modeling view - with and without standardization on image data / Example workflow

Let’s compare the performance of the same model on different versions of a dataset, with and without standardization. This workflow uses the dataset Predict California house prices (here). To see how the different versions of this dataset were prepared, read the example workflow on our create different versions of a dataset page.
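As a reminder of what standardization does to the data, here is a minimal Python sketch that rescales a list of values to zero mean and unit variance. The function name standardize and the pixel values are hypothetical and only illustrate the idea. The platform may compute its statistics differently, for example over the whole dataset rather than a single image.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    if std == 0:
        # A constant feature has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical pixel intensities from one flattened image channel.
pixels = [0, 64, 128, 192, 255]
standardized = standardize(pixels)
```

After this transformation the values are centered on zero, which is the difference between the StdImage and NoStdImage dataset versions that the experiments compare.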

Step 1: Create experiments with dataset version NoStdImage/TargetStd

Click New experiment and name it NoStdImage/TargetStd.

Use the CNN snippet (here) as the neural architecture, since it is a simple one for image data.

Step 2: Configure the dataset settings

Set the Dataset version to NoStdImage/TargetStd.

Step 3: Configure the blocks settings in the CNN snippet

Select the Input block and set the Input feature to image_path. Select the Target block, set the Target feature to Target_medianhouseValue, and set the Loss to Mean square error. Select the last Dense block, set the number of Nodes to 1, and set Linear as the Activation function.
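The block settings above describe a regression setup: a single output node with a linear activation, trained with a mean squared error loss. The sketch below shows those two pieces in plain Python. The function names and example numbers are hypothetical and do not come from the platform.

```python
def linear_output(features, weights, bias):
    """Single node with a linear (identity) activation: a weighted sum plus bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical values, for illustration only.
prediction = linear_output([1.0, 2.0], weights=[0.5, -1.0], bias=0.25)
loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Because the activation is linear, the output is not squashed into a fixed range, which suits a continuous target such as a house value.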

Step 4: Configure the settings for running the model

Navigate to the Settings tab in the Inspector. Set the Batch size to 128 and the Epochs to 50. Set the Data access seed to 2 for all of the experiments.
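To see why a fixed seed matters when comparing experiments, consider the sketch below. It assumes that the seed controls the order in which examples are drawn into batches. That is an assumption made for illustration, since this page does not describe how the platform uses the Data access seed internally. The function shuffled_batches and the dataset size of 1000 are hypothetical.

```python
import random


def shuffled_batches(n_examples, batch_size, seed):
    """Shuffle example indices with a fixed seed, then split them into batches."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # same seed gives the same order
    return [indices[i:i + batch_size] for i in range(0, n_examples, batch_size)]


# Hypothetical dataset size; batch size and seed match the settings above.
run_a = shuffled_batches(n_examples=1000, batch_size=128, seed=2)
run_b = shuffled_batches(n_examples=1000, batch_size=128, seed=2)
```

With the same seed, both runs see identical batches in the same order, so any difference in loss is less likely to come from data ordering.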

Step 5: Click Run

Step 6: Duplicate the experiment

While it’s running, duplicate this experiment (without weights) and keep the default name NoStdImage/TargetStd 2. Then duplicate NoStdImage/TargetStd 2 to get NoStdImage/TargetStd 3, and let both run with the same settings. Remember to set the Data access seed to 2. The purpose of running several experiments with the same settings is to get an average loss value for each dataset version (with and without standardization), so that the averages can be compared. Keep in mind that a statistical test requires more experiments before a conclusion can be drawn.
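The averaging described above can be sketched in a few lines of Python. The loss values are placeholders invented for illustration, not results from this workflow, and the function summarize is hypothetical.

```python
from statistics import fmean, stdev


def summarize(losses):
    """Return the mean and sample standard deviation of a list of final losses."""
    return fmean(losses), stdev(losses)


# Placeholder numbers for illustration only, NOT measured results.
losses_no_std = [0.52, 0.48, 0.50]  # e.g. three NoStdImage/TargetStd runs
losses_std = [0.41, 0.45, 0.43]     # e.g. three StdImage/TargetStd runs

mean_no_std, spread_no_std = summarize(losses_no_std)
mean_std, spread_std = summarize(losses_std)
```

With only three runs per dataset version, the spread is a rough indication at best, which is why more experiments are needed before applying a statistical test.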

Step 7: Create experiments with dataset version StdImage/TargetStd

Duplicate the experiment NoStdImage/TargetStd so that the same settings are inherited, and change only the Dataset version to StdImage/TargetStd in the Modeling view.

Following Step 6, create 3 experiments with the dataset version StdImage/TargetStd.

The Modeling view enables experiment source tracking. Click the experiment link at the bottom of the Modeling canvas to see the source experiment. The experiment creator and the creation date are shown at the bottom right.

Conclusion

It’s recommended to rename each experiment with meaningful keywords, which makes it easier to monitor and compare experiments with different settings. Next, compare the results in the Evaluation view.