Let's start with a general overview. All deep learning algorithms belong to the class of artificial neural networks (NNs). NNs take their name from the fact that they loosely mimic the functioning of the human brain: they consist of interconnected nodes called neurons.
Deep Learning algorithms
Deep Learning is a broad field that can be intimidating at first. It is very important to choose the right algorithm for the task at hand, because the wrong model can hurt performance or make it impossible to solve your problem. In this article we will introduce 10 of the most important deep learning algorithms, how they work, and what they can be used for.
02/ How do deep learning algorithms work?
03/ 10 types of deep learning algorithms
04/ Multilayer Perceptrons (MLPs)

How do MLPs work?

Multilayer Perceptrons are the simplest feed-forward neural networks: an input layer, one or more hidden layers, and an output layer, with every node in one layer connected to every node in the next. Each neuron computes a weighted sum of its inputs and passes it through a non-linear activation function, and the information flows in a single direction from input to output.
When to use MLPs?
MLPs are useful when you have tabular data and you want to solve a classification problem, where each input is assigned a corresponding class. They can also be used for regression problems, where for each input you predict a real value. For example, you can use an MLP to predict daily sales from many parameters, or to predict whether a customer will buy a product based on earlier customer patterns.
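A forward pass for such a binary-classification MLP can be sketched in a few lines of NumPy. The layer sizes and random weights below are arbitrary assumptions for illustration; a real model would learn its weights from data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 input features, 8 hidden units, 1 output.
W1 = rng.normal(scale=0.1, size=(4, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1))
b2 = np.zeros(1)

def mlp_forward(x):
    """Forward pass: input -> hidden (ReLU) -> output (sigmoid)."""
    h = np.maximum(0.0, x @ W1 + b1)                 # hidden layer activation
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))      # probability of the positive class

x = rng.normal(size=(3, 4))   # a batch of 3 tabular examples
probs = mlp_forward(x)
print(probs.shape)  # (3, 1)
```

Every input is mapped to a value between 0 and 1, which can be read as the probability that the customer buys the product.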
05/ Convolutional Neural Networks (CNNs)
Convolutional Neural Networks (CNNs) are a type of neural network built around convolutional layers. Their name stems from one of the most important mathematical operations in deep learning: convolution. CNNs are another type of feed-forward neural network, where the information flows in only one direction, from input to output.
How do CNNs work?
The core concept behind the CNN architecture is the convolutional layer, i.e. a hidden layer that employs convolution. This mathematical operation is very important in deep learning, especially in computer vision. Computers “see” images as big matrices of numbers, but thanks to convolution we can train a model to recognize objects and patterns in an image.
How convolution works. In the image below you can see the structure of a convolutional layer. In the left image, the first, larger matrix represents the input of the convolutional layer; the second, smaller matrix is called the filter (or kernel). You multiply a portion of the input matrix element-wise by the filter and sum every element of the result, producing one output value. Then you slide the filter across the input matrix (as you can see in the second image) and repeat the procedure, continuing until the filter has covered the whole input matrix.
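The slide-multiply-sum procedure above can be written directly in NumPy. The 4x4 "image" and 2x2 filter below are toy values chosen for illustration:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` across `image`; at each position, multiply element-wise and sum."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])        # toy 2x2 filter
result = conv2d_valid(image, kernel)
print(result.shape)  # (3, 3): the 2x2 filter fits in 3x3 positions
```

Each output cell summarizes one local patch of the input, which is exactly what lets a trained filter respond to a pattern (an edge, a corner) wherever it appears in the image.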
When to use CNNs?
CNNs are widely used in computer vision and time series forecasting. For example, with this tutorial you can learn how to classify an image of a handwritten digit as one of the 10 digits (0-9).
06/ Recurrent neural networks (RNNs)
Recurrent neural networks (RNNs) are a class of neural networks developed to deal with sequential data, like the text of a paragraph or a series of weather measurements.
As you read this article, you understand each word based on your understanding of the previous words. In the same way, you want the computer to remember what happened previously in the sequence instead of thinking from scratch at every new step. This is the main idea behind how RNNs work.
How do RNNs work?
The simplest kinds of neural networks, the so-called feed-forward neural networks, process information in a single direction, from input to output. In RNNs, the information travels in two directions: from input to output, and recursively from a hidden layer back to itself. Their computation graph contains loops, and that’s why they are called recurrent neural networks. In this way they can "remember" what happened in the past and make informed decisions at the following steps.
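The recurrence can be sketched as a single hidden state that is updated at every time step. The sizes and random weights below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 5                         # hypothetical sizes
W_x = rng.normal(scale=0.1, size=(input_size, hidden_size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

def rnn_forward(sequence):
    """Process a sequence step by step, feeding the hidden state back into itself."""
    h = np.zeros(hidden_size)
    for x_t in sequence:
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # the recurrent loop: h depends on its past
    return h                                   # a summary of everything seen so far

seq = rng.normal(size=(7, input_size))         # a sequence of 7 time steps
final_state = rnn_forward(seq)
print(final_state.shape)  # (5,)
```

The same weights `W_x` and `W_h` are reused at every step, which is what lets the network handle sequences of any length.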
When to use RNNs?
Recurrent neural networks are not appropriate for tabular or image data; they work best with sequences. For example, you can use RNNs for text or speech problems, or for regression tasks on time series.
07/ Long Short-Term Memory networks (LSTMs)
Recurrent neural networks are particularly useful when you are dealing with sequences because they are able to connect previous information to the present task. Sometimes, we only need to look at recent information to solve our problem, but there are also cases where we need more context and have to go back many steps to find the information we need. If the gap between the relevant information and the present step is very large, RNNs risk forgetting the past information. This is where Long Short-Term Memory networks (LSTMs) become very useful. LSTMs are RNNs that can learn long-term dependencies.
How do LSTMs work?
The key to LSTMs is the cell state, a special structure that works as a highway where information can flow from the beginning to the end of the sequence. The LSTM has the ability to remove or add information to the cell state through special structures called gates. This way the LSTM can keep relevant information for long periods of time.
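One step of a standard LSTM cell can be sketched as follows; the sizes and random gate weights are assumptions for illustration (a real implementation would also add bias terms and learn all parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4                                  # hypothetical sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, acting on the concatenated [input, previous hidden].
Wf, Wi, Wo, Wg = (rng.normal(scale=0.1, size=(n_in + n_hid, n_hid)) for _ in range(4))

def lstm_step(x, h, c):
    z = np.concatenate([x, h])
    f = sigmoid(z @ Wf)          # forget gate: what to erase from the cell state
    i = sigmoid(z @ Wi)          # input gate: what new information to write
    o = sigmoid(z @ Wo)          # output gate: what to expose as the hidden state
    g = np.tanh(z @ Wg)          # candidate values to add
    c_new = f * c + i * g        # cell state: the "highway" through time
    h_new = o * np.tanh(c_new)
    return h_new, c_new

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(6, n_in)):                # a 6-step sequence
    h, c = lstm_step(x, h, c)
print(h.shape, c.shape)  # (4,) (4,)
```

The additive update `c_new = f * c + i * g` is what preserves information across many steps: as long as the forget gate stays close to 1, old content survives unchanged.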
When to use LSTMs?
LSTMs are a type of recurrent network, so you can use them to solve problems with sequences, like speech recognition, grammar learning, or music composition.
08/ Generative adversarial networks (GANs)
Generative Adversarial Networks (GANs) are deep learning models used for generative modeling. Generative modeling means automatically learning the patterns in input data in such a way that the model can be used to generate new examples. When you train a GAN, you obtain a model that can create new data that are very similar to the examples in the original dataset.
How do GANs work?
GANs are formed by two sub-models:
- A generator model that is trained to generate new examples
- A discriminator model that tries to classify examples as either real (from the real dataset) or fake (generated by the generator model).
The two models are trained together in an adversarial way, until the generator can fool the discriminator model, meaning that the generator model is generating plausible examples.
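The two adversarial objectives can be sketched with toy stand-ins. Everything here is an assumption for illustration: the "generator" merely shifts noise, the "discriminator" is a fixed logistic score rather than a trained network, and in practice both losses would drive gradient updates in alternation:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, offset):
    return z + offset                            # toy "generator": shifts latent noise

def discriminator(x):
    return 1.0 / (1.0 + np.exp(-(x - 1.5)))      # toy score: estimated P(x is real)

real = rng.normal(loc=3.0, size=64)                    # real data: Gaussian around 3.0
fake = generator(rng.normal(size=64), offset=0.0)      # fakes start around 0.0

# Discriminator loss: label real samples 1 and fake samples 0.
d_loss = -np.mean(np.log(discriminator(real))) - np.mean(np.log(1.0 - discriminator(fake)))
# Generator loss: reward fakes that the discriminator scores as real.
g_loss = -np.mean(np.log(discriminator(fake)))
print(d_loss > 0 and g_loss > 0)  # True
```

Training alternates between lowering `d_loss` (the discriminator gets sharper) and lowering `g_loss` (the generator's samples drift toward the real distribution), until the discriminator can no longer tell the two apart.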
When to use GANs?
GANs are great for generating new plausible examples. You can use GANs to create new photographs similar to the ones in your dataset or to generate new levels in a video game.
09/ Autoencoders

Autoencoders are a special kind of neural network used for representation learning, i.e. finding useful patterns in data. This technique is useful when you want to compress a large dataset into a lower-dimensional representation.
How do autoencoders work?
The basic architecture of an autoencoder consists of two parts, the encoder and the decoder:
- The encoder maps the input into a compressed version, the code;
- The decoder tries to reconstruct the input from the code.
The simplest way to perform this task would be to simply copy the input. Instead, we impose a bottleneck so that the decoder is forced to reconstruct the input approximately, preserving only the most relevant aspects of the data. This way the network learns which features of the data are most important.
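The encoder-decoder structure with its bottleneck can be sketched like this; the sizes and random weights are assumptions, and a trained autoencoder would learn them by minimizing the reconstruction error:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_code = 8, 2            # hypothetical sizes: 8 features squeezed into a 2-D code

W_enc = rng.normal(scale=0.1, size=(n_in, n_code))
W_dec = rng.normal(scale=0.1, size=(n_code, n_in))

def encode(x):
    return np.tanh(x @ W_enc)   # compressed representation: the code (the bottleneck)

def decode(code):
    return code @ W_dec         # approximate reconstruction of the input

x = rng.normal(size=(5, n_in))  # 5 examples with 8 features each
codes = encode(x)
recon = decode(codes)
print(codes.shape, recon.shape)  # (5, 2) (5, 8)
```

Because the code has only 2 dimensions, the network cannot simply copy its 8-dimensional input; it must keep whatever information matters most for the reconstruction.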
When to use autoencoders?
Autoencoders are used for a variety of tasks, like dimensionality reduction, image compression or denoising. In this tutorial you can learn how to use an autoencoder for image denoising.
10/ Radial Basis Function Networks (RBFNs)
Radial Basis Function Networks are a kind of neural network used for function approximation problems.
How do RBFNs work?
RBFNs are based on radial basis functions. A radial basis function is a function whose value depends only on the distance between its input and a fixed point, called the centre. The hidden layer of an RBFN computes one such function per neuron, and the output is a linear combination of those activations.
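A minimal sketch with Gaussian basis functions follows; the centres, width, and output weights are made-up values, whereas in practice the centres are often fitted by clustering and the weights by linear regression:

```python
import numpy as np

# Hypothetical setup: 3 fixed centres in 2-D, a Gaussian basis, linear output weights.
centres = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
sigma = 1.0
weights = np.array([0.5, -1.0, 2.0])

def rbf_features(x):
    """Each hidden unit's activation falls off with distance from its centre."""
    d2 = np.sum((centres - x) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))

def rbfn_predict(x):
    return rbf_features(x) @ weights   # linear combination of the radial activations

print(rbfn_predict(np.array([0.0, 0.0])))
```

An input sitting exactly on a centre activates that unit maximally (activation 1.0), and far-away inputs barely activate it at all; this locality is what makes RBFN training fast.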
When to use RBFNs?
The main advantage of RBFNs is their training speed. Among RBFNs applications there are function approximation, time series prediction, classification, and system control.
11/ Self Organizing Maps (SOMs)
A Self-Organizing Map (SOM) is a kind of neural network which is used for dimensionality reduction.
Dimensionality reduction means representing the input space in a lower dimension. When you have a high-dimensional dataset with many features, it is very difficult to visualize. The features might also be correlated with each other, and hence redundant. Reducing the dimension of the dataset helps both visualization and exploratory data analysis of high-dimensional datasets.
How do SOMs work?
In a self-organizing map neurons are arranged in a 2-dimensional grid that can take the shape of either rectangles or hexagons. During the learning phase, neurons on the grid will gradually coalesce around areas with high density of data points. As the neurons move, they bend the grid to reflect the shape of the data. Areas with many neurons reflect underlying clusters in the data.
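A single SOM update step can be sketched as follows; the grid size, learning rate, and neighbourhood radius are arbitrary assumptions, and a full training run would repeat this over many data points while shrinking both:

```python
import numpy as np

rng = np.random.default_rng(0)

grid_h, grid_w = 5, 5                               # hypothetical 5x5 rectangular grid
weights = rng.normal(size=(grid_h, grid_w, 2))      # each neuron holds a 2-D prototype
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing="ij"), axis=-1)

def som_step(x, lr=0.5, radius=1.5):
    """One update: move the best-matching unit and its grid neighbours toward x."""
    dist = np.sum((weights - x) ** 2, axis=-1)
    bmu = np.unravel_index(np.argmin(dist), dist.shape)     # best matching unit
    grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
    influence = np.exp(-grid_d2 / (2 * radius ** 2))        # neighbourhood kernel
    weights[...] = weights + lr * influence[..., None] * (x - weights)

x = np.array([2.0, -1.0])
before = np.sum((weights - x) ** 2)
som_step(x)
after = np.sum((weights - x) ** 2)
print(after < before)  # True: the map has bent toward the data point
```

Because neighbours on the grid are pulled together, nearby neurons end up representing nearby regions of the data, which is what produces the map-like low-dimensional layout.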
When to use SOMs?
SOMs are used for dimensionality reduction. Dimensionality reduction is useful when you have a dataset with large numbers of variables, such as in the fields of neuroinformatics, bioinformatics or signal processing.
12/ Restricted Boltzmann Machines (RBMs)
A restricted Boltzmann machine (RBM) is a type of generative model. Generative modeling means automatically learning the patterns in input data in such a way that the model can be used to generate new examples.
How do RBMs work?
Unlike normal feed-forward neural networks, a Boltzmann machine has connections among all of its nodes, irrespective of whether they are visible (input) or hidden nodes. By sharing information this way, the network can capture the patterns and correlations in the data and learn to generate new data. In the restricted variant, the connections within each layer are removed: visible units connect only to hidden units and vice versa, forming a bipartite graph, which makes training far more tractable.
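The bipartite structure shows up directly in one step of Gibbs sampling, the alternation an RBM uses both for training and for generating data. The sizes and random weights below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 3                             # hypothetical sizes
W = rng.normal(scale=0.1, size=(n_vis, n_hid))  # one weight per visible-hidden pair
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_step(v):
    """Sample the hidden units given the visibles, then reconstruct the visibles."""
    p_h = sigmoid(v @ W + b_hid)
    h = (rng.random(n_hid) < p_h).astype(float)       # binary hidden sample
    p_v = sigmoid(h @ W.T + b_vis)                    # no visible-visible connections
    return (rng.random(n_vis) < p_v).astype(float)    # reconstructed visible sample

v0 = rng.integers(0, 2, size=n_vis).astype(float)     # a binary input vector
v1 = gibbs_step(v0)
print(v1)
```

Because there are no connections within a layer, all hidden units can be sampled in parallel given the visibles, and vice versa; this is exactly what the "restriction" buys.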
When to use RBMs?
RBMs are useful for dimensionality reduction, classification, regression or feature learning.
13/ Deep Belief Networks (DBNs)
A Deep Belief Network (DBN) is a generative model capable of creating new plausible samples by learning the probability distribution of your data.
How do DBNs work?
DBNs learn to probabilistically reconstruct their input. They consist of a stack of multiple Restricted Boltzmann Machines, where each layer works as a feature detector for the layer above it.
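The stacked structure can be sketched as a chain of RBM-style layers, each feeding its hidden activations to the next. The layer sizes and random weights here are assumptions; in practice each layer is pre-trained greedily as an RBM before the stack is fine-tuned:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical stack: 12 -> 8 -> 4 -> 2, one weight matrix per RBM in the stack.
sizes = [12, 8, 4, 2]
layers = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes, sizes[1:])]

def dbn_features(v):
    """Pass data up the stack; each layer's hidden units feed the next RBM."""
    for W in layers:
        v = sigmoid(v @ W)      # each layer detects features of the layer below
    return v

x = rng.random(12)
feats = dbn_features(x)
print(feats.shape)  # (2,)
```

Each successive layer detects increasingly abstract features of the input, which is why the top-level representation is useful both for generation and for downstream classification.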