Loss functions
The loss function is a critical part of model training: it quantifies how well a model is performing a task by calculating a single number, the loss, from the model output and the desired target.
If the model predictions are totally wrong, the loss will be a high number. If they’re pretty good, it will be close to zero.
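To make this concrete, here is a minimal NumPy sketch (illustrative only, not platform code) that computes one common loss, the mean squared error, for a few hypothetical predictions:

```python
import numpy as np

# Hypothetical targets and model predictions for a regression task
targets = np.array([1.0, 2.0, 3.0])
predictions = np.array([1.1, 1.9, 3.2])

# Mean squared error: a single number summarizing how far off the model is
loss = np.mean((predictions - targets) ** 2)
print(loss)  # 0.02 -- close to zero, so the predictions are pretty good

# Totally wrong predictions produce a much higher loss
bad_predictions = np.array([10.0, -5.0, 0.0])
print(np.mean((bad_predictions - targets) ** 2))  # ~46.3
```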
You select the Loss function you want to use in the parameters of the Target block.
During training, the optimizer tunes the model to minimize the loss on training examples.
After at least one epoch has run, the loss and metrics plot in the Evaluation view shows the average value of the loss over all the training examples, as well as over the validation examples.
Choosing a loss function
If training a model is like rolling a ball down a hill, the loss function is the profile of that hill: it determines how steep the slope is and where the lowest point lies.
All loss functions are minimal when the model prediction is equal to its target.
However, different loss functions change how the model behaves when exact equality cannot be achieved, e.g., when the data is noisy, the input doesn't contain enough information, or the model isn't perfect.
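To illustrate the hill analogy, here is a small sketch of gradient descent on a one-parameter model (the example values are made up): the optimizer repeatedly measures the slope of the loss and takes a step downhill.

```python
# One-parameter "model": prediction = w * x, fitted to a single example
x, target = 2.0, 6.0   # the true relationship is w = 3
w = 0.0                # start somewhere up the hill
learning_rate = 0.1

for step in range(5):
    prediction = w * x
    loss = (prediction - target) ** 2          # height of the hill at w
    gradient = 2 * (prediction - target) * x   # slope of the hill at w
    w -= learning_rate * gradient              # roll a little way downhill
    print(f"step {step}: loss={loss:.4f}, w={w:.4f}")
# The loss shrinks toward zero as w approaches 3, the bottom of the hill.
```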
In practice, the choice of the loss function is mostly directed by the task that the model needs to solve. The following loss functions are available on the platform:
| Classification | Regression |
|---|---|
| Single label: | Continuous values: |
| Multi-label: | Discrete values: |
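The platform handles this choice through the Target block, but as a rough sketch of the same decision in code, here is how the classification and regression cases might look in Keras (the model architectures are hypothetical):

```python
from tensorflow import keras

# Single-label classification: the last layer outputs class probabilities,
# and the loss is a crossentropy between those probabilities and the target.
classifier = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
classifier.compile(optimizer="adam", loss="categorical_crossentropy")

# Regression on continuous values: the last layer outputs an unbounded
# number, and the loss measures the distance to the target value.
regressor = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="linear"),
])
regressor.compile(optimizer="adam", loss="mean_squared_error")
```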
Compatibility with activation functions
Some loss functions can only be calculated for a limited range of model outputs.
You can ensure that the model output is always in the correct range by using an appropriate activation function on the last block of the model.
The platform will warn you if the activation function of the last block is incompatible with the loss function selected in the Target block.
Example
The categorical crossentropy loss function needs to calculate the logarithm of the model prediction, which is only possible if the model output is strictly positive.
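A short NumPy sketch (illustrative, not platform code) shows what goes wrong: the logarithm is undefined for zero or negative values, so the loss becomes nan unless the last activation, such as softmax, keeps the output strictly positive.

```python
import numpy as np

# Categorical crossentropy: -sum(target * log(prediction))
def categorical_crossentropy(prediction, target):
    return -np.sum(target * np.log(prediction))

target = np.array([0.0, 1.0, 0.0])  # one-hot target: class 1 is correct

# With a softmax-style output, every value is strictly positive
good_output = np.array([0.1, 0.8, 0.1])
print(categorical_crossentropy(good_output, target))  # ~0.223

# A raw linear output can be zero or negative, and the logarithm fails
bad_output = np.array([0.0, -0.5, 1.5])
print(categorical_crossentropy(bad_output, target))   # nan (with warnings)
```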