We’ve taken the next step in putting full-scale language model capabilities at our users’ fingertips in an easy, accessible way. With the release of Natural Language Processing (NLP) capabilities on our platform in September, we’ve taken a leap even further! Allow us to introduce: BERT.
Get ready for our pre-trained BERT
What is BERT and why it matters
BERT stands for Bidirectional Encoder Representations from Transformers and is a new, state-of-the-art deep learning language processing model. The idea and theory behind BERT were originally introduced in a Google AI Language research paper in 2018.
Since its introduction, BERT has quickly outperformed other Natural Language Processing (NLP) methods, becoming a state-of-the-art solution and a major breakthrough for the entire industry.
BERT performs significantly better than previous language models
- BERT is bi-directional. This means it understands the meaning of a word based on the full context it appears in. For example, imagine the two sentences “The cat didn’t eat the food because it wasn’t hungry” and “The cat didn’t eat the food because it wasn’t fresh.” For a human, it is clear that the word “it” refers to the cat in the first sentence and to the food in the second. Because BERT looks at the context in both directions (unlike traditional language models, which only look one way and are thus one-directional), it too can successfully work out whether “it” refers to the cat or to the food.
- BERT uses attention. This means it can handle long passages of text more easily. For example, consider a text containing the sentence “We have great neighbors where we live in Stockholm,” followed a few lines later by “On one occasion last year, we had a party for some of our closest friends and accidentally played the music a little bit too loud. They became really angry.” Previous language models could not identify who “they” refers to because of the distance between the words “neighbors” and “they.” BERT, on the other hand, understands that it’s the neighbors being referred to.
- A BERT model can be used for many different purposes. Once fine-tuned on a relatively small labeled dataset for a domain-specific problem, BERT can achieve high accuracy very quickly.
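The attention mechanism described above can be sketched in a few lines. This is a minimal, toy-dimension illustration of the scaled dot-product self-attention at the heart of the Transformer, not the platform’s implementation: every token’s output is a weighted mix of all other tokens’ vectors, which is why distant words like “neighbors” and “they” can influence each other directly.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each output position is a weighted
    mix of every value vector, so information flows between distant
    tokens in a single step."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq, seq) pairwise similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

# Toy example: 4 "tokens", each an 8-dimensional vector
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, w = attention(x, x, x)              # self-attention: Q = K = V = x
```

Because the weight matrix `w` compares every token with every other token, the distance between two words in the text does not dilute their connection the way it does in a sequential model.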
Pre-trained BERT now available on the Peltarion Platform
BERT is a complex model with many stacked layers and parameters - 12 stacked layers and 110 million parameters, to be exact. Building and training a BERT model from scratch is therefore very expensive, takes a lot of time and requires huge amounts of data. This is why we provide a pre-trained BERT model, so that you only need to input your data and define the size of the vocabulary you want to use.
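The 110 million figure can be sanity-checked from BERT-base’s published hyperparameters. A back-of-the-envelope tally, assuming the standard 30,522-token WordPiece vocabulary of the released English model (not a platform-specific setting):

```python
# Rough parameter count for BERT-base from its published hyperparameters.
hidden = 768        # hidden size
layers = 12         # stacked Transformer encoder layers
ffn = 4 * hidden    # feed-forward inner size (3072)
vocab = 30522       # WordPiece vocabulary size (assumed: standard English model)
max_pos = 512       # maximum sequence length
segments = 2        # sentence A/B segment embeddings

embeddings = (vocab + max_pos + segments) * hidden + 2 * hidden  # + LayerNorm
attn_proj = 4 * (hidden * hidden + hidden)   # Q, K, V and output projections
feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
layer_norms = 2 * (2 * hidden)               # two LayerNorms per layer
per_layer = attn_proj + feed_forward + layer_norms
pooler = hidden * hidden + hidden

total = embeddings + layers * per_layer + pooler
print(f"{total:,}")
```

The tally lands just under 110 million, which is where the commonly quoted figure comes from - and most of those parameters sit in the 12 encoder layers, which is why training them from scratch is so costly.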
What used to be an expensive and complex process, requiring weeks of training and knowledge of specific code libraries, now takes just a few clicks. Using BERT on the Peltarion Platform gives you access to a pre-trained BERT model capable of producing very accurate results, even when training on a small dataset.
Being able to fine-tune an existing BERT model on domain-specific topics with a small amount of labeled data and a short training time enables numerous companies to build business value quickly.
Get started using BERT
Currently, we provide a pre-trained BERT model for text classification that supports the English language. For recommendations on using our pre-trained BERT, check out this documentation.
Text classification can be applied to a variety of different aspects:
...across different kinds of media:
...and different functions: