The sigmoid function generates a smooth non-linear curve that squashes incoming values into the range 0 to 1. It works well as the output of a classifier model, but it suffers from vanishing gradients for large input values: y changes very slowly for high values of x, so the gradient there is close to zero.

Example: Given input values x of [1, 3, 10, 500, 10000, 10000000], y changes noticeably for the lower values but saturates at 1 for the high values. The differences between the high values are therefore lost.
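The saturation described above can be checked directly. This is a minimal sketch using only the Python standard library; the function name `sigmoid` is our own, not part of any framework:

```python
import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)).
    # For large positive x, e^(-x) underflows to 0.0, so f(x) rounds to exactly 1.0.
    return 1.0 / (1.0 + math.exp(-x))

for x in [1, 3, 10, 500, 10000]:
    print(x, sigmoid(x))
```

Running this shows sigmoid(1) ≈ 0.731 and sigmoid(3) ≈ 0.953, while 500 and 10000 both map to 1.0 in floating point: the distinction between the large inputs is gone.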

The sigmoid function is often used together with the binary crossentropy loss function, which you set in the Target block.
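To illustrate why the two pair well: sigmoid outputs a probability in (0, 1), which is exactly what binary crossentropy expects. A minimal sketch of the loss for a single example, with a hypothetical helper name `binary_crossentropy` and an assumed clipping constant `eps` to keep log() finite:

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip the predicted probability away from 0 and 1 so log() never sees 0.
    p = min(max(y_pred, eps), 1.0 - eps)
    # Standard binary crossentropy for one sample:
    # -(y * log(p) + (1 - y) * log(1 - p))
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

# A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one.
print(binary_crossentropy(1, 0.9))
print(binary_crossentropy(1, 0.1))
```

Frameworks apply the same formula averaged over a batch, often fusing the sigmoid and the loss into one numerically stable operation.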

\[f(x) = \frac{1}{1 + e^{-x}}\]
Figure 1. Sigmoid curve