NLP models

The Peltarion platform enables a wide range of text solutions through our integrated language models.

02/ BERT - State of the art NLP

BERT (Bidirectional Encoder Representations from Transformers) pushed the state of the art in NLP by combining two powerful technologies:

  • Transformer encoder network, a type of network that can process long texts efficiently by using self-attention.
  • Bidirectional training, meaning that it uses the whole text passage, both before and after each word, to understand the meaning of that word.
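
To make the two ideas above concrete, here is a minimal sketch of bidirectional self-attention in plain Python. It is an illustration under stated assumptions, not the BERT implementation: the token vectors are made-up numbers, and queries, keys and values are the token vectors themselves, whereas a real Transformer encoder learns separate projection matrices and uses several attention heads. The names `softmax`, `self_attention` and `toy_tokens` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention without learned projections.

    Assumption: queries, keys and values are the token vectors themselves.
    A real Transformer encoder learns separate projection matrices.
    """
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Score this token against every token in the passage,
        # to its left and to its right (this is the bidirectional part).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # The new representation is a weighted mix of all token vectors.
        mixed = [
            sum(w * tok[i] for w, tok in zip(weights, tokens))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Toy 2-dimensional "embeddings" for three tokens (made-up numbers).
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, attention = self_attention(toy_tokens)
```

Each row of `attention` holds one token's weights over the whole passage. Because no mask hides later positions, the first token also attends to the tokens that follow it.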

BERT comes pretrained on the platform, so that you can use it with minimal training work.

03/ Multilingual BERT - 100 languages

Multilingual BERT lets you deploy a single model that works with any of its 100 supported languages.

Why use a multilingual model?

More than a simple convenience, multilingual models often perform better than monolingual models.

One reason is that the training data available is generally more limited in any single language. In addition, many languages share common patterns that the model can pick up more easily when it is trained with a variety of languages.

04/ Sentence XLM-R - Similarity expert & 100 languages

The Sentence XLM-R block provides good Natural Language Processing capabilities for 100 languages, and is trained specifically for text similarity applications.

Sentence XLM-R is a good model for processing text written naturally and in many languages.

It is pretrained more specifically for sentence embedding, making it a better choice for text similarity tasks out of the box.
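
As a rough illustration of how sentence embeddings are used for text similarity, the sketch below compares embedding vectors with cosine similarity. The three-dimensional vectors are invented for the example and do not come from Sentence XLM-R, whose real embeddings have hundreds of dimensions. The names `cosine_similarity` and `most_similar` are hypothetical helpers, not platform functions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vector, candidates):
    """Return the candidate text whose embedding is closest to the query."""
    return max(
        candidates,
        key=lambda text: cosine_similarity(query_vector, candidates[text]),
    )

# Hypothetical embeddings (made-up numbers, NOT real model output).
query_text = "How do I reset my password?"
query_vector = [0.9, 0.1, 0.2]

french_text = "Comment réinitialiser mon mot de passe ?"
store_text = "What time does the store open?"
candidates = {
    french_text: [0.8, 0.2, 0.3],  # same meaning, different language
    store_text: [0.1, 0.9, 0.4],   # unrelated meaning
}

best_match = most_similar(query_vector, candidates)
```

With these toy vectors, `best_match` is the French sentence. A multilingual sentence embedding model aims for the same effect: sentences with the same meaning end up close together regardless of language.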

05/ USE - Fast similarity expert

The multilingual USE (Universal Sentence Encoder) is a model that can process text in 16 languages and produce embeddings that are suitable for semantic text similarity tasks.

The Universal Sentence Encoder block runs faster than Sentence XLM-R, especially on longer texts. It can be less accurate, though.

USE is fine-tuned for text similarity, allowing you to deploy the model without requiring any training.

06/ GPT-3 - proprietary kickass model

GPT-3 is a language model released by OpenAI in May 2020 and exclusively licensed to Microsoft.

GPT-3 can do amazing things, but it requires skill, a lot of computing power and the right infrastructure to use. It’s only trained on what is “commonly known”, that is, NOT your domain-specific tasks.

It’s still quite bad at some language tasks, such as comparing two sentences.

GPT-3 vs BERT?

  • GPT-3 is your kickass friend who helps you out with anything but is not the expert at what you do. And not available.
  • BERT is your colleague who really tries to be the best at solving the task you work with. And it's available!

