How AI works

Like humans, AI systems aren’t born intelligent. They have to learn and adapt, and they do so in essentially the same way humans do: by taking in information, or data, processing it and storing it for future reference.

For a child, that may mean learning to walk or run without falling, while for AI it might mean learning to keep a car on the road, identify cancerous tumors in CAT scans or find the most efficient way to ship t-shirts from Portugal to Norway.

The goal of AI is to create systems capable of simulating human intelligence in order to execute tasks. One set of AI techniques, called Machine Learning, has delivered impressive results in recent years. Within Machine Learning there are a number of subfields, each with its own techniques and strategies. Two of the most popular are Deep Learning and Reinforcement Learning, which can be used separately or in tandem.

The inner workings of a Deep Neural Network are based on advanced mathematics. But the basic concept is quite easy to grasp. In order to create an AI solution, you need two things:

- A specific task to tackle or problem to solve
- And (a lot of) data associated with that task or problem

Deep Learning & Reinforcement Learning
Deep Learning algorithms allow software to train itself to complete tasks that resemble human intuition, which has made Deep Learning one of the most effective and popular AI techniques available today. Beating the world champion at Go, detecting cancer, driving cars without humans: many of the best-known achievements in AI to date relied on Deep Learning. It works by exposing multilayered networks of artificial neurons (nodes) to a lot of data. It is called “deep” because of the many layers of nodes a Deep Neural Network may have.

Deep Learning models can be trained using supervised or unsupervised learning. Supervised learning uses labeled training data to learn to predict specific outcomes for new data, while unsupervised learning finds structure in data without any predefined labels. Reinforcement learning, on the other hand, uses feedback that rewards or punishes a model’s actions, steering it toward the best possible outcomes.

How does it work?

Yes, it can all sound very abstract. So say you wanted to teach an AI model to identify every picture of a walrus on the Internet. In a supervised model, you would input thousands of pictures and tell the machine which ones have a walrus in them. This is your labeled training data. Then, after it has seen plenty of photos, when you feed it a new photo it should be able to tell you accurately whether it shows a walrus or just an elephant taking a nap. As a rule, the more good training data there is, the more accurate the model gets.
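As a rough illustration of the supervised workflow, the sketch below swaps photos for two invented measurements per animal (tusk length and body length) and swaps the neural network for a much simpler nearest-centroid classifier. The feature names, the numbers and the `train` and `predict` helpers are all assumptions made up for this example. The point is only the pattern: learn from labeled examples, then label a new one.

```python
# Toy supervised learning: learn from labeled examples, then label new ones.
# Features are (tusk length in cm, body length in m); all values are invented.

def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# The labeled training data: each example comes with the right answer attached.
labeled_data = [
    ((60.0, 3.2), "walrus"),
    ((75.0, 3.5), "walrus"),
    ((50.0, 2.9), "walrus"),
    ((0.0, 4.5), "not walrus"),
    ((0.0, 2.0), "not walrus"),
    ((1.0, 5.0), "not walrus"),
]

model = train(labeled_data)
print(predict(model, (65.0, 3.0)))  # a new animal with long tusks
print(predict(model, (0.5, 4.0)))   # a new animal with almost no tusks
```

A real image model learns its own features from pixels instead of being handed measurements, but the train-then-predict split is the same.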

An unsupervised model wouldn’t be great for determining walrus vs. no walrus, but could be used to identify patterns in images to form separate clusters of pictures of walruses, pictures of elephant seals and pictures of sea lions, and then let you identify the walrus picture cluster from there.
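One common way to form such clusters is k-means, sketched below on made-up data. The two measurements per animal (body length in meters, weight in tonnes), the values and the `k_means` helper are assumptions for illustration, and a real image model would cluster features extracted from the images rather than hand-picked numbers. Note that the algorithm never sees a label.

```python
# Toy unsupervised learning: group unlabeled points into k clusters (k-means).
# Each point is (body length in m, weight in tonnes); all values are invented.

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=10):
    # Start from the first k points to keep the sketch deterministic;
    # real implementations usually pick random starting points.
    centroids = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the average of its assigned points.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, clusters

animals = [
    (3.0, 1.2), (4.8, 2.8), (2.2, 0.3),
    (3.2, 1.4), (5.0, 3.0), (2.0, 0.25),
    (2.9, 1.0), (4.6, 2.5), (2.4, 0.35),
]

centroids, clusters = k_means(animals, k=3)
for group in clusters:
    print(group)
```

The output is three unnamed groups. Deciding which group is "the walruses" is still up to a person looking at the result.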

With reinforcement learning you start without a prepared data set, and instead teach the model to solve problems by trial and error. Say you want to teach a machine to play a game like Mario Kart. In the beginning, the software tries things randomly, but by rewarding successes and punishing errors, it learns step by step how to keep the car on the road. Reinforcement learning is best suited to problems where right and wrong are easy to signal, for example, teaching a robot to pick up a physical object. It is less suited to finding walruses.
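The trial-and-error loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm, on a drastically simplified stand-in for a racing game. The five-position track, the reward values, the hyperparameters and every name in the code are assumptions invented for this example.

```python
import random

# Toy track: positions 1, 2 and 3 are road; positions 0 and 4 are off the road.
ACTIONS = (-1, 0, 1)   # steer left, go straight, steer right
ROAD = (1, 2, 3)

def step(position, action):
    """Move the car; return (new_position, reward, crashed)."""
    new_position = position + action
    if new_position not in ROAD:
        return new_position, -1.0, True    # punish leaving the road
    return new_position, 1.0, False        # reward staying on it

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[(state, action)] estimates how good an action is in a given position.
    q = {(s, a): 0.0 for s in ROAD for a in ACTIONS}
    for episode in range(episodes):
        position = ROAD[episode % len(ROAD)]   # cycle through start positions
        for _ in range(20):                    # cap the length of each episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)   # explore: try something random
            else:
                action = max(ACTIONS, key=lambda a: q[(position, a)])  # exploit
            new_position, reward, crashed = step(position, action)
            future = 0.0 if crashed else max(q[(new_position, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(position, action)] += alpha * (reward + gamma * future - q[(position, action)])
            if crashed:
                break
            position = new_position
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in ROAD}
print(policy)
```

After training, the learned policy should avoid steering left from position 1 and right from position 3, because those are the moves that were punished.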

It’s all about training the model with data, and how that’s done depends on the kind of problem being solved.

How to train a model
For an AI model to succeed at something like learning to recognize animals, it requires training. No matter what, plenty of good data is needed to start training a model. In our walrus example, that data would not only be thousands of photos, it would be thousands of photos that are already labeled as either containing a walrus or not.

Those training photos are sent through the network, and the model is told to guess whether each photo contains a walrus or not. Each layer in the model works on a different level of walrus identification, from simple lines and colors to higher-level shapes and shades, all derived from the image’s pixels. When the model is told whether it guessed right or wrong, each connection in the model adjusts its weighting to focus in on the features that seem to constitute a walrus (tusks, flippers, blubber), until, after thousands or millions of guesses, it has a pretty good idea of what a walrus looks like. This is supervised training.
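The guess, check and adjust cycle can be sketched with the classic perceptron rule, a much simpler stand-in for the gradient-based updates that real deep networks use. Here a single artificial neuron looks at three invented yes/no features instead of pixels. The features, the tiny data set and the helper names are assumptions for illustration only.

```python
# One artificial neuron learning "walrus or not" with the perceptron rule.
# Features are (has_tusks, has_flippers, has_trunk); the data is invented.

def predict(weights, bias, features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train(examples, max_epochs=20):
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    mistakes_per_epoch = []
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            # error is 0 for a correct guess, +1 or -1 for a wrong one.
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                # Adjust each connection in the direction that fixes the mistake.
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:   # a full pass with no wrong guesses: stop training
            break
    return weights, bias, mistakes_per_epoch

examples = [
    ((1, 1, 0), 1),  # tusks and flippers: walrus
    ((1, 0, 1), 0),  # tusks and a trunk: elephant
    ((0, 1, 0), 0),  # flippers only: sea lion
    ((0, 0, 0), 0),  # none of the above
]

weights, bias, history = train(examples)
print(history)        # wrong guesses in each pass over the data
print(weights, bias)
```

The number of wrong guesses per pass should shrink to zero on this small data set, and the learned weights end up favoring tusks and flippers while penalizing a trunk.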

The adjustment that takes place after each training image goes through the model is the algorithm’s attempt to minimize the cost function, a measure found in most AI models of how far the model’s guesses are from the correct answers. The lower the cost, the better the model fits its training data. And when well trained, our model would be better at spotting walruses than any polar bear out there, something computers were traditionally very bad at. Such is the power of Deep Learning.
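To make the cost function concrete, here is a minimal sketch that uses cross-entropy, one common choice of cost for yes/no problems, together with plain gradient descent. The one-feature model, the invented "tusk visibility" scores, the learning rate and the function names are all assumptions for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, data):
    """Average cross-entropy: near 0 for confident correct guesses, large for confident wrong ones."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)   # the model's estimated probability of "walrus"
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)

def gradient_step(w, b, data, learning_rate):
    """Move the weight and bias a small step in the direction that lowers the cost."""
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / len(data)
    grad_b = sum(sigmoid(w * x + b) - y for x, y in data) / len(data)
    return w - learning_rate * grad_w, b - learning_rate * grad_b

# (tusk visibility score between 0 and 1, label: 1 = walrus, 0 = not walrus)
data = [(0.9, 1), (0.8, 1), (0.2, 0), (0.1, 0)]

w, b = 0.0, 0.0
costs = [cost(w, b, data)]
for _ in range(50):
    w, b = gradient_step(w, b, data, learning_rate=0.5)
    costs.append(cost(w, b, data))

print(costs[0], costs[-1])   # cost before and after training
```

With an untrained model every guess is 50/50, so the starting cost is ln 2 (about 0.69). Each step should bring it lower.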

Picking the AI brain
Exactly how neural networks arrive at their solutions is hard for humans to grasp. We don’t know what exact set of rules the model has made for itself to tell a walrus from a sleeping elephant. But we know that it works. And the process by which it works isn’t magic at all. So let’s have a look at the inner workings of a deep neural network, the AI brain.

The neurons of an AI model are grouped into three different types of layers:

Input Layer
The input layer brings the raw data into the model. This layer presents the input data as its component parts so they can be analyzed more closely in subsequent layers. If it’s an image being processed, the input layer will pass the smallest pieces of that image, its pixels, to the next layer, where they can be analyzed up close. All of the real analysis done by a model takes place in those following layers, called hidden layers.
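In practice, "bringing in the raw data" often means turning an image into a plain list of numbers. The sketch below assumes a grayscale image stored as rows of 0 to 255 pixel values. The `to_input_vector` helper and the 2x2 image are made up for illustration.

```python
def to_input_vector(image):
    """Flatten a grid of 0-255 grayscale pixels into one list of values between 0 and 1."""
    return [pixel / 255 for row in image for pixel in row]

# A hypothetical 2x2 image: black, white, and two shades of dark gray.
tiny_image = [
    [0, 255],
    [51, 102],
]

inputs = to_input_vector(tiny_image)
print(len(inputs))   # one input value per pixel
print(inputs)
```

Scaling the values to the range 0 to 1 is a common convention rather than a requirement. A real photo would produce thousands or millions of input values instead of four.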

Hidden Layers
The hidden layers perform mathematical computations on our inputs. The first hidden layer in an image detection model will identify the most elemental pieces of the image, like edges and simple shapes. The following layers will identify more and more complex pieces of the image like, say, a walrus’s tusk, making the puzzle pieces bigger and bigger until the puzzle becomes easier to solve. This is where the “deep” in Deep Learning originates: the more layers, the deeper the network and the wider the range of puzzle pieces the model has at its disposal to learn from.
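The computation inside the hidden layers can be sketched as repeated weighted sums followed by a simple activation function (ReLU here, a common choice). The weights below are hand-picked assumptions so the behavior is easy to follow. In a trained network they would be learned from data, and the layers would be far larger.

```python
def relu(values):
    """Keep positive signals, silence negative ones."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each neuron computes a weighted sum of all its inputs plus a bias."""
    return [sum(w * x for w, x in zip(neuron_weights, inputs)) + b
            for neuron_weights, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations

layers = [
    # Hidden layer 1: two "edge detectors" that fire when a bright pixel
    # sits next to a dark one.
    ([[1.0, -1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, -1.0]], [0.0, 0.0]),
    # Hidden layer 2: one neuron that fires only when both edges are present.
    ([[1.0, 1.0]], [-1.5]),
]

striped = [1.0, 0.0, 1.0, 0.0]   # bright, dark, bright, dark
flat = [1.0, 1.0, 1.0, 1.0]      # uniformly bright, no edges

print(forward(striped, layers))  # → [0.5]
print(forward(flat, layers))     # → [0.0]
```

The second layer never looks at pixels directly. It only combines what the first layer found, which is the sense in which later layers work with bigger puzzle pieces.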

Output Layer
The output layer consolidates and delivers the output data, or in the case of our walrus model, it takes a guess at whether the image has a walrus in it. If the output generated by the AI is proven wrong by the training data’s label (understandable, since elephant seals look a lot like walruses), the cost function will calculate how far off the model is so it can adjust its calculations accordingly and try again on the next image. This process is repeated over the data set until the output layer returns no more (or far fewer) mistakes. Then, training is complete and the model can be put to work. Go ahead, find all the walruses.
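For a yes/no question like this one, the output layer is often a single neuron whose score is squashed into a probability. The sketch below assumes a sigmoid output and a squared-error measure of how far off a guess is. The score of 1.2 and both function names are hypothetical.

```python
import math

def output_layer(score):
    """Turn the final score into a probability and a yes/no guess."""
    probability = 1.0 / (1.0 + math.exp(-score))   # sigmoid maps any number into (0, 1)
    return probability, probability >= 0.5

def how_far_off(probability, label):
    """Squared difference between the guess and the true label (1 = walrus, 0 = not)."""
    return (label - probability) ** 2

# Hypothetical score the hidden layers produced for an elephant seal photo.
probability, is_walrus = output_layer(1.2)

print(is_walrus)                            # the model's guess
print(how_far_off(probability, label=0))    # the photo was not a walrus: large error
print(how_far_off(probability, label=1))    # had it been a walrus: small error
```

The larger the error for a photo, the stronger the correction the model receives before it moves on to the next one.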

Machines and humans
While this process is intriguing in theory, it’s easy to see why it would be hard for AI to achieve human-level intelligence through neural network training alone. It takes thousands or even millions of pieces of data to train a machine to learn even one simple task, while a human can read an encyclopedia, go into nature and make some pretty accurate guesses about what they’re seeing. Such is the power of the human brain.

But when it comes to solving problems one at a time, a machine that has learned something can reach far higher accuracy than humans, can work on the problem around the clock with no nights, weekends or holidays off, and will never forget how to solve it. Such is the power of trained Deep Learning AI. This means the most powerful intelligence we’re able to achieve today is one in which artificial and human intelligences work in tandem.
