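Tanh is a scaled and shifted sigmoid function: tanh(x) = 2 * sigmoid(2x) - 1. Its output lies in (-1, 1) and is centered at zero, whereas sigmoid output lies in (0, 1). The gradient is stronger for tanh than for sigmoid, that is, the derivative is steeper: it peaks at 1 at x = 0, compared with 0.25 for sigmoid.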
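
The relationship between the two functions can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (sigmoid, tanh_from_sigmoid, sigmoid_grad, tanh_grad) are illustrative and do not come from any particular framework.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x: float) -> float:
    """Tanh written as a scaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t

# The scaled-sigmoid form agrees with math.tanh at several sample points.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert math.isclose(tanh_from_sigmoid(x), math.tanh(x), abs_tol=1e-12)

# Both derivatives are largest at x = 0, where tanh is four times steeper.
print(sigmoid_grad(0.0))  # 0.25
print(tanh_grad(0.0))     # 1.0
```

The factor of four at the origin is what "stronger gradient" means here. Away from zero both derivatives shrink toward 0 as the functions saturate.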


Whether to use sigmoid or tanh depends on how strong a gradient you need. For small inputs, tanh(x) is approximately equal to x, so as long as the activations of the network can be kept small, each tanh unit behaves almost like a linear function. This tends to make a tanh network easier to train.
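
To see how quickly the linear approximation breaks down as inputs grow, the gap between tanh(x) and x can be printed for a few values. This is a rough illustration only; the sample points and the helper name linear_gap are arbitrary choices, not part of any library.

```python
import math

def linear_gap(x: float) -> float:
    """Absolute difference between tanh(x) and the identity line y = x."""
    return abs(math.tanh(x) - x)

# Small inputs stay close to the line; larger inputs saturate toward 1.
for x in (0.01, 0.1, 0.5, 1.0, 2.0):
    print(f"x={x:<5} tanh(x)={math.tanh(x):.6f} gap={linear_gap(x):.6f}")
```

The gap grows roughly like x^3 / 3 for small x, which is why keeping activations small keeps the units in their near-linear range.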

Figure 1. Tanh curve