模型:

microsoft/xlm-align-base

中文

XLM-Align

XLM-Align (ACL 2021, paper , repo , model ) Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

XLM-Align is a pretrained cross-lingual language model that supports 94 languages. See details in our paper .

Example

model = AutoModel.from_pretrained("microsoft/xlm-align-base")

Evaluation Results

XTREME cross-lingual understanding tasks:

Model POS NER XQuAD MLQA TyDiQA XNLI PAWS-X Avg
XLM-R_base 75.6 61.8 71.9 / 56.4 65.1 / 47.2 55.4 / 38.3 75.0 84.9 66.4
XLM-Align 76.0 63.7 74.7 / 59.0 68.1 / 49.8 62.1 / 44.8 76.2 86.8 68.9

MD5

b9d214025837250ede2f69c9385f812c  config.json
6005db708eb4bab5b85fa3976b9db85b  pytorch_model.bin
bf25eb5120ad92ef5c7d8596b5dc4046  sentencepiece.bpe.model
eedbd60a7268b9fc45981b849664f747  tokenizer.json

About

Contact: chizewen@outlook.com

BibTeX:

@inproceedings{xlmalign,
  title = "Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment",
  author={Zewen Chi and Li Dong and Bo Zheng and Shaohan Huang and Xian-Ling Mao and Heyan Huang and Furu Wei},
  booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
  month = aug,
  year = "2021",
  address = "Online",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2021.acl-long.265",
  doi = "10.18653/v1/2021.acl-long.265",
  pages = "3418--3430",}