Tips to improve for intermediate users

Improve your model architecture

  • Start simple - then increase model complexity
    As a rule of thumb, a small dataset needs a smaller, less complex architecture. Start simple, then add blocks between the first and last block in subsequent experiments. This lets you systematically try different options and see how each change impacts performance.
    Increase the model complexity until you obtain a low training loss. Then, if you are overfitting your data (i.e., if there is a large gap between your training and validation loss), apply regularization techniques or reduce the model complexity slightly.
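The workflow above can be sketched as a simple diagnosis loop over experiment results. Everything here is illustrative: the loss numbers are made up, and `diagnose` with its `gap_threshold` cut-off is a hypothetical helper, not a platform feature.

```python
# Illustrative run log: (number of blocks, training loss, validation loss).
runs = [(1, 0.80, 0.82), (2, 0.35, 0.40), (4, 0.10, 0.45)]

def diagnose(train_loss, val_loss, gap_threshold=0.2):
    """Classify a run as underfitting, overfitting, or a good fit."""
    if val_loss - train_loss > gap_threshold:
        return "overfitting"   # large train/val gap: regularize or simplify
    if train_loss > 0.5:
        return "underfitting"  # training loss still high: add capacity
    return "good fit"

for blocks, train, val in runs:
    print(blocks, "blocks:", diagnose(train, val))
```

In this made-up log, one block underfits, two blocks fit well, and four blocks overfit, which is the point where you would add regularization or step back down in complexity.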

  • Change loss function
    The loss function is usually tightly tied to the problem that you want to solve. Even so, trying different loss functions, such as MSE or MAE for regression problems, can slightly improve your results.
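To see why the choice matters, here is a minimal pure-Python comparison of the two losses on the same predictions; the sample values are made up. MSE punishes a single large error quadratically, while MAE grows only linearly, so MAE is less sensitive to outliers.

```python
def mse(y_true, y_pred):
    """Mean squared error over a batch of predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error over a batch of predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.0, 8.0]   # one large error (an outlier)
print(mse(y_true, y_pred))  # the outlier dominates the squared error
print(mae(y_true, y_pred))  # the absolute error grows only linearly
```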

  • Regularization techniques
    If you are overfitting your data, you can add Dropout blocks as a regularization technique. Start with a dropout rate of 0.1 and increase it up to 0.5 if you are still overfitting.
    Note that a dropout rate of 0 (the default) is the same as not having the Dropout block, while a rate of 1 (or close to 1) essentially destroys the model, since it throws away the outputs of the preceding computations. In practice, dropout rates higher than 0.5 are rarely used. Early stopping is also useful against overfitting: you stop training before the validation error grows too large.
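As a sketch of what these two techniques do, here is a pure-Python illustration (not the platform's implementation; `dropout` and `should_stop` are hypothetical helper names):

```python
import random

def dropout(activations, rate, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors so the expected sum is unchanged."""
    if not training or rate == 0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if random.random() < keep else 0.0 for a in activations]

def should_stop(val_losses, patience=3):
    """Early stopping: stop once the best validation loss is `patience`
    or more epochs in the past."""
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5))  # roughly half the units zeroed
print(should_stop([1.0, 0.9, 0.95, 0.96, 0.97]))  # True: best loss 3 epochs back
```

With rate=0 the input passes through untouched, and with rate near 1 almost everything is zeroed, which matches the note above about the extreme settings.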

Advanced experiment tuning

Once you have settled on the overall model structure but want to achieve an even better model you can start some advanced tuning of your model. Here are some ideas on what you can do.

  • Change optimizer & optimizer parameters
    The default optimizer on the platform is Adam. Once you have settled on the overall model structure, it can be worth testing an alternative, such as AdamW or stochastic gradient descent (SGD) with momentum. This is classic hyperparameter tuning: try each option and see what works best. Any of these optimizers may achieve superior results, though getting there can require tuning other Run settings parameters, for example the learning rate.
    How and where? Change Optimizer in the Settings tab in the Modeling view.
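For intuition, one SGD-with-momentum update can be sketched in a few lines of plain Python. This toy `sgd_momentum_step`, its default values, and the example objective are all illustrative, unrelated to the platform's optimizer settings.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update: the velocity accumulates a decaying
    average of past gradients, which smooths and speeds up descent."""
    velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    params = [p + v for p, v in zip(params, velocity)]
    return params, velocity

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.0.
params, velocity = [5.0], [0.0]
for _ in range(200):
    grads = [2 * p for p in params]
    params, velocity = sgd_momentum_step(params, grads, velocity)
print(params[0])  # close to the minimum at 0
```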

  • Change specific parameters in included blocks
    Modify block-specific parameters by clicking on the block that you want to change.
    How and where?

    • Change Dropout rate in Dropout block. This may reduce overfitting and improve performance. Good values for the dropout rate should be between 0.1 and 0.5.

    • Try different filter sizes in convolutional layers. Small filters like 3x3 or 5x5 usually perform better.

    • Change the stride, width and height of pooling in Pooling layers.
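When changing filter sizes, strides, and pooling parameters, it helps to check how each choice affects a layer's spatial output size. The standard output-size formula can be sketched as follows (`conv_output_size` is a hypothetical helper; the 32x32 input is an example):

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis:
    (input - filter + 2 * padding) // stride + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# A 32x32 input through a 3x3 convolution, stride 1, no padding -> 30x30.
print(conv_output_size(32, 3))            # 30
# Then a 2x2 pooling with stride 2 -> 15x15.
print(conv_output_size(30, 2, stride=2))  # 15
```

Adding padding of 1 to the 3x3 convolution keeps the output at 32, which is a common way to preserve spatial size.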

  • Change Learning rate schedule
    There is no go-to learning rate schedule that works for all models.
    In general, using a learning rate schedule makes training less sensitive to the initial learning rate value you pick. A schedule can therefore give better training performance and make the model converge faster.
    Try exponential decay. The exponential schedule scales the learning rate down by the same factor (%) every epoch. This means that the learning rate decreases rapidly in the first few epochs and spends more epochs at lower values, but never reaches exactly zero.
    How and where? Change Learning rate schedule in the Settings tab in the Modeling view.
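The exponential schedule described above amounts to lr(epoch) = initial_lr × factor^epoch. A minimal sketch, where the 0.01 starting rate and 0.9 decay factor are arbitrary example values:

```python
def exponential_decay(initial_lr, decay_factor, epoch):
    """Scale the learning rate by the same factor every epoch."""
    return initial_lr * decay_factor ** epoch

for epoch in range(4):
    # Drops quickly at first (0.01, 0.009, 0.0081, ...) but never reaches 0.
    print(epoch, exponential_decay(0.01, 0.9, epoch))
```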
