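Tanh is a scaled and shifted sigmoid function: tanh(x) = 2 * sigmoid(2x) - 1, so its output lies in (-1, 1) rather than (0, 1). Around zero, the gradient is stronger for tanh than for sigmoid, that is, the curve is steeper: the derivative of tanh peaks at 1, while the derivative of sigmoid peaks at 0.25. Both derivatives shrink toward zero once the input saturates.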
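The relationship between the two functions can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (sigmoid, tanh_from_sigmoid, sigmoid_grad, tanh_grad) are illustrative and not taken from any particular framework.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_from_sigmoid(x: float) -> float:
    """Tanh written as a scaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # The identity holds up to floating-point rounding.
    assert math.isclose(tanh_from_sigmoid(x), math.tanh(x), abs_tol=1e-12)
    print(f"x={x:+.1f}  tanh'={tanh_grad(x):.4f}  sigmoid'={sigmoid_grad(x):.4f}")
```

The printed values show that tanh has the larger derivative near zero, while for larger inputs both derivatives become small as the functions saturate.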
Whether to use sigmoid or tanh depends on how strong a gradient you need. As long as the activations of the network can be kept small, tanh closely resembles a linear function, since tanh(x) is approximately x near zero. In that regime a tanh network behaves much like a linear model, which makes it simpler to analyze and train.
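To get a feel for how small "small" needs to be, one can compare tanh(x) with the line y = x. This is a rough sketch, assuming relative difference is a reasonable measure of closeness; relative_gap is a hypothetical helper defined only for this illustration.

```python
import math


def relative_gap(x: float) -> float:
    """Relative difference between tanh(x) and the identity line y = x (x != 0)."""
    return abs(math.tanh(x) - x) / abs(x)


for x in (0.01, 0.1, 0.5, 1.0, 2.0):
    print(f"x={x:<5} tanh(x)={math.tanh(x):.4f}  gap={relative_gap(x):.2%}")
```

The gap stays well under one percent for inputs around 0.1 and grows quickly as the input moves toward the saturated region.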