Copy blocks with weights to another model

Use this feature to copy blocks with their trained weights from a trained experiment to a new experiment, for example, when you want to adapt a model without retraining it from scratch and need detailed control over exactly what is copied.

You’ll find the Copy blocks with weights icon in the upper left corner of the Modeling canvas.

Multiple inputs

One occasion when you may want to use copy with weights is when you have a model with more than one input dataset. In this case, it’s a good idea to create a separate experiment for each input dataset and optimize each experiment on its own. When you’ve done this, copy the trained blocks with weights and paste them into a new experiment. This way of working makes it easier to optimize each net.

Example: Tabular and image inputs

In the Predict California house prices tutorial, we build one experiment with two inputs. But you could just as well build two experiments, one for each input. First, train the net for the tabular input until you are satisfied. Then train the net for the image input until you are satisfied. Now copy the trained blocks with weights, paste the blocks into one final experiment, concatenate the input nets into one net, and train the model to see if you get better predictions.
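
As a rough illustration of the structure you end up with, here is a minimal Keras sketch of two input nets joined into one model. The layer sizes and names are invented for the example; on the platform, pasting the copied blocks does this wiring for you.

```python
# Minimal Keras sketch of two input nets concatenated into one model.
# All layer sizes and names are invented for illustration.
import tensorflow as tf
from tensorflow.keras import layers

# Pretend these blocks were trained in two separate experiments and
# copied in with their weights.
tabular_input = tf.keras.Input(shape=(8,), name="tabular")
x = layers.Dense(64, activation="relu")(tabular_input)    # copied block

image_input = tf.keras.Input(shape=(64, 64, 3), name="image")
y = layers.Conv2D(32, 3, activation="relu")(image_input)  # copied block
y = layers.GlobalAveragePooling2D()(y)

# Concatenate the two input nets into one net and add a fresh head.
combined = layers.Concatenate()([x, y])
output = layers.Dense(1, name="price")(combined)

model = tf.keras.Model([tabular_input, image_input], output)
model.compile(optimizer="adam", loss="mse")
```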

Transfer learning

Copy with weights can be used if you want to transfer knowledge from one of your models to another model. This is known as transfer learning in machine learning terms. It can be useful if you have trained on a big dataset (such as ImageNet) and want to reuse parts of that knowledge to solve other, related problems.

Say you want to tell whether an image depicts a truck or a car. First train a model on ImageNet; the model then learns, for example, that a car has wheels. It was never given any explicit information about wheels; that knowledge was inferred from lots of pictures of cars carrying the label “car”. The model represents this information in the weights it learns.

Now copy the weights from the first trained model into a new model. The new model can easily be adapted to recognize trucks, as they too have wheels. Your new model can also learn the difference between a car and a truck without needing to be trained on cars from scratch.
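
As an illustration of the same idea outside the platform, here is a minimal Keras sketch that reuses ImageNet weights for a car-versus-truck classifier; the base network and the head are choices made for the example:

```python
# Minimal Keras sketch: reuse weights trained on ImageNet for a new
# car-vs-truck classifier. The head is invented for illustration.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.MobileNetV2(
    include_top=False, weights="imagenet", pooling="avg"
)
base.trainable = False  # keep the copied weights fixed to start with

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
outputs = layers.Dense(2, activation="softmax")(x)  # car vs. truck

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```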

Another possible application is when you build and train an autoencoder and then copy the encoder part (and possibly the decoder part too) separately into a new model.
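
A minimal Keras sketch of that workflow, with invented layer sizes, could look like this:

```python
# Minimal sketch: train an autoencoder, then reuse only its encoder.
# Layer sizes are invented for illustration.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(784,))
encoded = layers.Dense(32, activation="relu", name="encoder")(inputs)
decoded = layers.Dense(784, activation="sigmoid", name="decoder")(encoded)
autoencoder = tf.keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
# ... autoencoder.fit(...) on unlabeled data ...

# "Copy with weights": the encoder model shares the trained weights.
encoder = tf.keras.Model(inputs, encoded)

new_inputs = tf.keras.Input(shape=(784,))
z = encoder(new_inputs)                          # copied with weights
out = layers.Dense(10, activation="softmax")(z)  # new task head
classifier = tf.keras.Model(new_inputs, out)
```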

Trainable or non-trainable

In the Copy with weights pop-up, you can select if you want the copied blocks to be trainable or non-trainable in the new experiment.

Trainable means that the weights associated with the blocks will update during training. Often you want to start with non-trainable; otherwise there is a risk of overfitting, i.e., that the experiment fits too closely to a limited set of data points.
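
To make the distinction concrete, here is a minimal Keras sketch, with invented layer names, of freezing a copied block and later unfreezing it for fine-tuning:

```python
# Minimal Keras sketch of the trainable / non-trainable setting.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    layers.Dense(64, activation="relu", name="copied_block"),
    layers.Dense(1, name="new_head"),
])

# Non-trainable: the copied weights stay frozen during training.
model.get_layer("copied_block").trainable = False
model.compile(optimizer="adam", loss="mse")
# ... model.fit(...) trains only the new head ...

# Later you can unfreeze and fine-tune with a low learning rate.
# Note that the model must be recompiled for the change to take effect.
model.get_layer("copied_block").trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="mse")
```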

Note
Be aware of where the weights come from when copying and re-using blocks with weights. There is nothing in the platform preventing some or all of the samples used to train the weights from ending up in the validation set of the new model. If that happens, your results will suggest that your model generalizes better than it actually does.
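
If you have access to the raw datasets, a simple sanity check outside the platform can catch this. Here is a minimal pandas sketch; the file paths and the sample_id column are hypothetical:

```python
# Minimal sketch of a leakage sanity check: make sure no sample that
# trained the copied weights appears in the new validation set.
# File paths and the "sample_id" column are hypothetical.
import pandas as pd

old_train = pd.read_csv("old_training_set.csv")
new_val = pd.read_csv("new_validation_set.csv")

overlap = set(old_train["sample_id"]) & set(new_val["sample_id"])
if overlap:
    print(f"Warning: {len(overlap)} validation samples were also used "
          "to train the copied weights; validation scores will look "
          "better than they should.")
```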

These blocks can be copied with weights

  • Dense

  • Embedding

  • Batch normalization

  • 2D convolution

  • 2D deconvolution

  • LSTM

Note
All other blocks can be copied as well, but since they do not have any weights, there is nothing to gain from copying them with weights.

Copy with weights parameters

Click the Copy with weights icon in the upper left corner of the Modeling canvas.

In the Copy with weights pop-up, you select from which epoch you want to copy weights.

For most experiments, the optimal epoch is the same for all of these options, but the options you can choose from may differ. A sketch after the list shows how each epoch could be identified from a training history.

  • Latest.

    The last epoch from the training. This option is always available.

  • Minimum validation loss.

    The epoch where the model performed best (according to the loss function) on the validation dataset. Lower loss is better. Only available if you have a validation set.

  • Minimum training loss.

    The epoch where the model performed best (according to the loss function) on the training dataset. Lower loss is better.

  • Best validation accuracy.

    The epoch where the model achieved the highest accuracy on the validation dataset (accuracy = correct predictions / total number of predictions). Higher is better. Not applicable for regression models.

  • Best training accuracy.

    The epoch where the model achieved the highest accuracy on the training dataset (accuracy = correct predictions / total number of predictions). Higher is better. Not applicable for regression models.
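
For reference, here is a minimal sketch of how each of these epochs could be identified from a Keras training history; the history values below are invented placeholders:

```python
# Minimal sketch of picking epochs from a training history.
# Pretend h came from history = model.fit(...).history, with the
# model compiled using metrics=["accuracy"] and a validation set.
import numpy as np

h = {
    "loss":         [0.9, 0.5, 0.3, 0.2],   # placeholder values
    "val_loss":     [1.0, 0.6, 0.4, 0.5],
    "accuracy":     [0.60, 0.80, 0.90, 0.95],
    "val_accuracy": [0.55, 0.75, 0.85, 0.80],
}

latest = len(h["loss"]) - 1                       # Latest
min_val_loss = int(np.argmin(h["val_loss"]))      # Minimum validation loss
min_train_loss = int(np.argmin(h["loss"]))        # Minimum training loss
best_val_acc = int(np.argmax(h["val_accuracy"]))  # Best validation accuracy
best_train_acc = int(np.argmax(h["accuracy"]))    # Best training accuracy
```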
