Sentence XLM-R
XLM-RoBERTa
Note
Disclaimer: Please note that datasets, machine-learning models, weights, topologies, research papers, and other content, including open-source software (collectively referred to as "Content"), provided and/or suggested by Peltarion for use in the Platform and otherwise, may be subject to separate third-party terms of use or license terms. You are solely responsible for complying with the applicable terms. Peltarion makes no representations or warranties about Content. You expressly relieve us from any and all liability, loss, or risk arising (directly or indirectly) from your use of any third-party content.
References
- Nils Reimers and Iryna Gurevych: Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation, 2020.
- Alexis Conneau, Kartikay Khandelwal, et al.: Unsupervised Cross-lingual Representation Learning at Scale, 2020.
- Guillaume Wenzek, Marie-Anne Lachaux, et al.: CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data, 2019.
License
The Sentence XLM-R block uses the xlm-r-100langs-bert-base-nli-stsb-mean-tokens model, with weights pre-trained by Hugging Face on 100 languages from CommonCrawl, SNLI, and STSb.