Binary crossentropy is a loss function used on problems involving yes/no (binary) decisions. For instance, in multi-label problems, where an example can belong to multiple classes at the same time, the model tries to decide for each class whether the example belongs to that class or not.
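For an example with N classes, the loss is

$$\text{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

where ŷ_i is the predicted value for class i and y_i is the corresponding true value (0 or 1).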
Binary crossentropy measures how far away from the true value (which is either 0 or 1) the prediction is for each of the classes and then averages these class-wise errors to obtain the final loss.
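The class-wise averaging described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation of any particular library: the function name `binary_crossentropy` and the clipping constant `eps` are assumptions introduced here, and the clipping exists only to avoid taking the logarithm of zero.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average the per-class binary crossentropy errors (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions away from exactly 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    # Error for each class: small when the prediction is close to the true value.
    per_class = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    # The final loss is the mean of the class-wise errors.
    return per_class.mean()

# Three classes: the example belongs to the first and third class.
loss = binary_crossentropy([1, 0, 1], [0.9, 0.2, 0.6])
print(loss)  # roughly 0.28
```

A confident wrong prediction is punished much harder than an uncertain one: predicting 0.01 for a class whose true value is 1 contributes about 4.6 to the sum, while predicting 0.6 contributes about 0.5.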
The block that feeds into this loss must use Sigmoid as its activation function, so that every predicted value lies between 0 and 1.
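To see why a Sigmoid fits here, the sketch below applies it to some made-up raw outputs. The helper `sigmoid` and the example values are assumptions for illustration. Each output is squashed into the range 0 to 1 independently of the others, so the values do not need to sum to 1, which is what allows several classes to be "on" at once.

```python
import numpy as np

def sigmoid(logits):
    """Map raw outputs to values strictly between 0 and 1, one class at a time."""
    return 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))

# Made-up raw outputs for three classes.
logits = np.array([2.0, -1.0, 0.0])
probs = sigmoid(logits)

print(probs)        # one independent yes/no score per class
print(probs.sum())  # generally not 1, unlike a softmax output
```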
You use binary crossentropy on multi-label problems.
Example: You want to determine the mood of a piece of music. Every piece can have more than one mood, for instance, it can be both "Happy" and "Energetic" at the same time. To solve this problem you use binary crossentropy.
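The music example can be sketched as follows. Only "Happy" and "Energetic" come from the example above; the moods "Sad" and "Calm", the predicted values, the helper names, and the 0.5 decision threshold are all hypothetical choices made for illustration. The target is a vector with one yes/no entry per mood, and each predicted value is turned into a decision separately.

```python
import numpy as np

# "Happy" and "Energetic" are from the example; "Sad" and "Calm" are hypothetical.
MOODS = ["Happy", "Energetic", "Sad", "Calm"]

def encode_moods(moods):
    """Build the target vector: 1 for each mood the piece has, 0 otherwise."""
    return np.array([1.0 if m in moods else 0.0 for m in MOODS])

def decode_moods(probs, threshold=0.5):
    """Make one yes/no decision per mood (the threshold is an assumption)."""
    return [m for m, p in zip(MOODS, probs) if p >= threshold]

target = encode_moods({"Happy", "Energetic"})
predicted = np.array([0.92, 0.71, 0.05, 0.30])  # made-up Sigmoid outputs
labels = decode_moods(predicted)

print(target)  # [1. 1. 0. 0.]
print(labels)  # ['Happy', 'Energetic']
```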