Binary crossentropy is a loss function used on problems involving yes/no (binary) decisions. For instance, in multi-label problems, where an example can belong to multiple classes at the same time, the model tries to decide for each class whether the example belongs to that class or not.
The block before the Target block must use Sigmoid as its activation function.
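One way to see why Sigmoid fits here: it squashes each raw output into the range (0, 1) independently, so every class gets its own yes/no probability and the values do not have to sum to 1. The following is a minimal sketch in plain Python, not the platform's implementation; the `sigmoid` helper and the raw scores are made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical raw scores for three classes (illustrative values only).
logits = [2.0, -1.0, 0.0]

# Each class is converted on its own, independently of the others.
probabilities = [sigmoid(z) for z in logits]
print(probabilities)
```

Because each value is computed separately, several classes can have a high probability at the same time, which is what a multi-label problem needs.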
When to use binary crossentropy
You use binary crossentropy on multi-label problems.
Example: You want to determine the mood of a piece of music. Every piece can have more than one mood; for instance, it can be both "Happy" and "Energetic" at the same time. To solve this problem, you use binary crossentropy.
This is what we do in our tutorial Predicting mood from raw audio data.
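As a rough illustration of the mood example, the target for one piece of music can be written as a list with one 0/1 entry per mood. This is only a sketch: the `MOODS` list and the `encode_moods` helper are hypothetical, and "Sad" and "Calm" are added here purely to fill out the label set.

```python
# Hypothetical label set; only "Happy" and "Energetic" come from the example.
MOODS = ["Happy", "Energetic", "Sad", "Calm"]


def encode_moods(active_moods):
    """Return one 0/1 entry per mood: 1 if the piece has that mood, else 0."""
    return [1 if mood in active_moods else 0 for mood in MOODS]


# A piece that is both "Happy" and "Energetic" has two entries set to 1.
target = encode_moods({"Happy", "Energetic"})
print(target)
```

Each entry is a separate yes/no decision, which is why more than one entry can be 1 for the same piece.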
Binary crossentropy math
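$$\text{Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right]$$
where ŷ is the predicted value, y is the true value (either 0 or 1), and N is the number of classes.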
For each class, binary crossentropy measures how far the prediction is from the true value (which is either 0 or 1). It then averages these class-wise errors to obtain the final loss.
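The class-wise error and the averaging step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the platform's implementation: the function name, the `eps` clipping value (used to avoid taking the logarithm of 0), and the example numbers are all chosen here for illustration.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average the per-class binary crossentropy over all classes."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Assumption: clip predictions so log(0) can never occur.
        p = min(max(p, eps), 1.0 - eps)
        # Error for one class: small when p is close to the true value y.
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Illustrative values: true labels and Sigmoid outputs for four classes.
y_true = [1, 1, 0, 0]
y_pred = [0.9, 0.6, 0.2, 0.1]

loss = binary_crossentropy(y_true, y_pred)
print(round(loss, 4))  # approximately 0.2362
```

The second class contributes most to the loss here, because its prediction of 0.6 is the furthest from its true value of 1.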
You can read more on how to use binary crossentropy in our cheat sheet for multi-label image classification.
We also have a topic about Binary crossentropy in our Glossary.