Model:
megantosh/flair-arabic-dialects-codeswitch-egy-lev
Task:
Token classification
License:
apache-2.0

Pretrained part-of-speech tagging model built on a joint corpus written in Egyptian and Levantine (Jordanian, Lebanese, Palestinian, Syrian) dialects, with code-switching between Egyptian Arabic and English. The model is trained using Flair (forward + backward) and fastText embeddings.
This sequence labeling model was pretrained on three corpora jointly:
```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the pretrained tagger by its model ID
tagger = SequenceTagger.load("megantosh/flair-arabic-dialects-codeswitch-egy-lev")

# Wrap the raw text in a Flair Sentence
sentence = Sentence('عمرو عادلي أستاذ للاقتصاد السياسي المساعد في الجامعة الأمريكية بالقاهرة .')

# Predict part-of-speech tags in place
tagger.predict(sentence)

# Print each tagged span
for entity in sentence.get_spans('pos'):
    print(entity)
```
Because right-to-left text is embedded in a left-to-right context, some formatting errors might occur, and your code might appear like this (link accessed on 2020-10-27).
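One possible mitigation for garbled display, which the model card itself does not prescribe, is to wrap Arabic strings in Unicode directional isolates before printing them in a left-to-right terminal or log. The sketch below uses only the standard library; the helper names `isolate_rtl` and `strip_isolates` are hypothetical, and whether the result renders correctly depends on how well the terminal or editor supports the Unicode bidirectional algorithm.

```python
# Unicode directional formatting characters (Unicode Bidirectional Algorithm, UAX #9).
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate_rtl(text: str) -> str:
    """Wrap text so a bidi-aware renderer lays it out as one right-to-left run."""
    return f"{RLI}{text}{PDI}"


def strip_isolates(text: str) -> str:
    """Remove the isolate marks again, e.g. before passing text to a tagger."""
    return text.replace(RLI, "").replace(PDI, "")


example = "عمرو عادلي أستاذ للاقتصاد السياسي"
wrapped = isolate_rtl(example)

# The surrounding Latin text keeps its left-to-right order around the Arabic run.
print(f"input: {wrapped} (sent to the tagger as one sentence)")
```

The isolates are display-only: the sketch strips them again with `strip_isolates` so that the text handed to the tagger is unchanged.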
Tag | precision | recall | f1-score | support |
---|---|---|---|---|
INTJ | 0.8182 | 0.9000 | 0.8571 | 10 |
NOUN | 0.9009 | 0.9402 | 0.9201 | 435 |
NUM | 0.9524 | 0.8333 | 0.8889 | 24 |
ADJ | 0.8762 | 0.7603 | 0.8142 | 121 |
ADP | 0.9903 | 0.9623 | 0.9761 | 106 |
CCONJ | 0.9600 | 0.9730 | 0.9664 | 74 |
PROPN | 0.9333 | 0.9333 | 0.9333 | 15 |
ADV | 0.9135 | 0.8051 | 0.8559 | 118 |
VERB | 0.8852 | 0.9231 | 0.9038 | 117 |
PRON | 0.9620 | 0.9465 | 0.9542 | 187 |
SCONJ | 0.8571 | 0.9474 | 0.9000 | 19 |
PART | 0.9350 | 0.9791 | 0.9565 | 191 |
DET | 0.9348 | 0.9149 | 0.9247 | 47 |
PUNCT | 1.0000 | 1.0000 | 1.0000 | 35 |
AUX | 0.9286 | 0.9811 | 0.9541 | 53 |
MENTION | 0.9231 | 1.0000 | 0.9600 | 12 |
V | 0.8571 | 0.8780 | 0.8675 | 82 |
FUT-PART+V+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
PROG-PART+V+PRON+PREP+PRON | 0.0000 | 1.0000 | 0.0000 | 0 |
ADJ+NSUFF | 0.6111 | 0.8462 | 0.7097 | 26 |
NOUN+NSUFF | 0.8182 | 0.8438 | 0.8308 | 64 |
PREP+PRON | 0.9565 | 0.9565 | 0.9565 | 23 |
PUNC | 0.9941 | 1.0000 | 0.9971 | 169 |
EOS | 1.0000 | 1.0000 | 1.0000 | 70 |
NOUN+PRON | 0.6986 | 0.8500 | 0.7669 | 60 |
V+PRON | 0.7258 | 0.8036 | 0.7627 | 56 |
PART+PRON | 1.0000 | 0.9474 | 0.9730 | 19 |
PROG-PART+V | 0.8333 | 0.9302 | 0.8791 | 43 |
DET+NOUN | 0.9625 | 1.0000 | 0.9809 | 77 |
NOUN+NSUFF+PRON | 0.9091 | 0.7143 | 0.8000 | 14 |
PROG-PART+V+PRON | 0.7083 | 0.9444 | 0.8095 | 18 |
PREP+NOUN+NSUFF | 0.6667 | 0.4000 | 0.5000 | 5 |
NOUN+NSUFF+NSUFF | 1.0000 | 0.0000 | 0.0000 | 3 |
CONJ | 0.9722 | 1.0000 | 0.9859 | 35 |
V+PRON+PRON | 0.6364 | 0.5833 | 0.6087 | 12 |
FOREIGN | 0.6667 | 0.6667 | 0.6667 | 3 |
PREP+NOUN | 0.6316 | 0.7500 | 0.6857 | 16 |
DET+NOUN+NSUFF | 0.9000 | 0.9310 | 0.9153 | 29 |
DET+ADJ+NSUFF | 1.0000 | 0.5714 | 0.7273 | 7 |
CONJ+PRON | 1.0000 | 0.8750 | 0.9333 | 8 |
NOUN+CASE | 0.0000 | 0.0000 | 0.0000 | 2 |
DET+ADJ | 1.0000 | 0.6667 | 0.8000 | 6 |
PREP | 1.0000 | 0.9718 | 0.9857 | 71 |
CONJ+FUT-PART+V | 0.0000 | 0.0000 | 0.0000 | 1 |
CONJ+V | 0.6667 | 0.7500 | 0.7059 | 8 |
FUT-PART | 1.0000 | 1.0000 | 1.0000 | 2 |
ADJ+PRON | 1.0000 | 0.0000 | 0.0000 | 8 |
CONJ+PREP+NOUN+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+NOUN+PRON | 0.3750 | 1.0000 | 0.5455 | 3 |
PART+ADJ | 1.0000 | 0.0000 | 0.0000 | 1 |
PART+NOUN | 0.5000 | 1.0000 | 0.6667 | 1 |
CONJ+PREP+NOUN | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+NOUN | 0.7000 | 0.7778 | 0.7368 | 9 |
URL | 1.0000 | 1.0000 | 1.0000 | 3 |
CONJ+FUT-PART | 1.0000 | 0.0000 | 0.0000 | 1 |
FUT-PART+V | 0.8571 | 0.6000 | 0.7059 | 10 |
PREP+NOUN+NSUFF+NSUFF | 1.0000 | 0.0000 | 0.0000 | 1 |
HASH | 1.0000 | 0.9412 | 0.9697 | 17 |
ADJ+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 3 |
PREP+NOUN+PRON | 0.0000 | 0.0000 | 0.0000 | 1 |
EMOT | 1.0000 | 0.8889 | 0.9412 | 18 |
CONJ+PREP | 1.0000 | 0.7500 | 0.8571 | 4 |
PREP+DET+NOUN+NSUFF | 1.0000 | 0.7500 | 0.8571 | 4 |
PRON+DET+NOUN+NSUFF | 0.0000 | 1.0000 | 0.0000 | 0 |
V+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 5 |
V+PRON+PREP+PRON | 0.0000 | 1.0000 | 0.0000 | 0 |
CONJ+NOUN+NSUFF | 0.5000 | 0.5000 | 0.5000 | 2 |
V+NEG-PART | 1.0000 | 0.0000 | 0.0000 | 2 |
PREP+DET+NOUN | 0.9091 | 1.0000 | 0.9524 | 10 |
PREP+V | 1.0000 | 0.0000 | 0.0000 | 2 |
CONJ+PART | 1.0000 | 0.7778 | 0.8750 | 9 |
CONJ+V+PRON | 1.0000 | 1.0000 | 1.0000 | 5 |
PROG-PART+V+PREP+PRON | 1.0000 | 0.5000 | 0.6667 | 2 |
PREP+NOUN+NSUFF+PRON | 1.0000 | 1.0000 | 1.0000 | 1 |
ADJ+CASE | 1.0000 | 0.0000 | 0.0000 | 1 |
PART+NOUN+PRON | 1.0000 | 1.0000 | 1.0000 | 1 |
PART+V | 1.0000 | 0.0000 | 0.0000 | 3 |
PART+V+PRON | 0.0000 | 1.0000 | 0.0000 | 0 |
FUT-PART+V+PRON | 0.0000 | 1.0000 | 0.0000 | 0 |
FUT-PART+V+PRON+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+V+PRON+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+V+PREP+PRON | 0.0000 | 1.0000 | 0.0000 | 0 |
CONJ+DET+NOUN+NSUFF | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+DET+NOUN | 0.6667 | 1.0000 | 0.8000 | 2 |
CONJ+PREP+DET+NOUN | 1.0000 | 1.0000 | 1.0000 | 1 |
PREP+PART | 1.0000 | 0.0000 | 0.0000 | 2 |
PART+V+PRON+NEG-PART | 0.3333 | 0.3333 | 0.3333 | 3 |
PART+V+NEG-PART | 0.3333 | 0.5000 | 0.4000 | 2 |
PART+PREP+NEG-PART | 1.0000 | 1.0000 | 1.0000 | 3 |
PART+PROG-PART+V+NEG-PART | 1.0000 | 0.3333 | 0.5000 | 3 |
PREP+DET+NOUN+NSUFF+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
PREP+PRON+DET+NOUN | 0.0000 | 1.0000 | 0.0000 | 0 |
PART+NSUFF | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+PROG-PART+V+PRON | 1.0000 | 1.0000 | 1.0000 | 1 |
PART+PREP+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+PART+PREP | 1.0000 | 0.0000 | 0.0000 | 1 |
NUM+NSUFF | 0.6667 | 0.6667 | 0.6667 | 3 |
CONJ+PART+V+PRON+NEG-PART | 1.0000 | 1.0000 | 1.0000 | 1 |
PART+NOUN+NEG-PART | 1.0000 | 1.0000 | 1.0000 | 1 |
CONJ+ADJ+NSUFF | 1.0000 | 0.0000 | 0.0000 | 1 |
PREP+ADJ | 1.0000 | 0.0000 | 0.0000 | 1 |
ADJ+NSUFF+PRON | 1.0000 | 0.0000 | 0.0000 | 2 |
CONJ+PROG-PART+V | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+PART+PROG-PART+V+PREP+PRON+NEG-PART | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+PART+PREP+PRON+NEG-PART | 0.0000 | 1.0000 | 0.0000 | 0 |
PREP+PART+PRON | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+ADV+NSUFF | 1.0000 | 0.0000 | 0.0000 | 1 |
CONJ+ADV | 0.0000 | 1.0000 | 0.0000 | 0 |
PART+NOUN+PRON+NEG-PART | 0.0000 | 1.0000 | 0.0000 | 0 |
CONJ+ADJ | 1.0000 | 1.0000 | 1.0000 | 1 |
The table above shows class scores for each tag. Note that compound tags (a single tag formed from multiple agglutinated parts of speech) are each treated as a separate class.
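To make the scoring concrete, here is a minimal sketch of per-class precision, recall, and F1 in which every compound tag string counts as its own class. It is not the evaluation script behind the table: the function name `per_class_scores` and the toy tag sequences are made up for illustration, and undefined ratios (zero denominators) are set to 1.0, an assumption that appears consistent with the sparse rows in the table.

```python
from collections import Counter


def per_class_scores(gold, predicted, zero_division=1.0):
    """Return {tag: (precision, recall, f1, support)}, one class per tag string."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1  # the gold tag was missed
            fp[p] += 1  # the predicted tag was wrong

    scores = {}
    for tag in sorted(set(gold) | set(predicted)):
        n_pred = tp[tag] + fp[tag]   # how often the tag was predicted
        support = tp[tag] + fn[tag]  # how often the tag occurs in gold
        precision = tp[tag] / n_pred if n_pred else zero_division
        recall = tp[tag] / support if support else zero_division
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[tag] = (precision, recall, f1, support)
    return scores


# Toy data (assumed): "PREP+NOUN" is a different class from "PREP" and "NOUN".
gold = ["PREP+NOUN", "NOUN", "PREP", "NOUN", "V+PRON"]
pred = ["PREP+NOUN", "NOUN", "PREP", "V", "V"]

for tag, (p, r, f1, n) in per_class_scores(gold, pred).items():
    print(f"{tag:10s} | {p:.4f} | {r:.4f} | {f1:.4f} | {n}")
```

On this toy input, `V` (predicted but absent from the gold tags) gets precision 0, recall 1, and support 0, while `V+PRON` (present in gold but never predicted) gets precision 1 and recall 0. The same two patterns show up in the low-support rows of the table.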
If you use this model, please consider citing this work:

```bibtex
@unpublished{MMHU21,
  author = "M. Megahed",
  title = "Sequence Labeling Architectures in Diglossia",
  year = {2021},
  doi = "10.13140/RG.2.2.34961.10084",
  url = {https://www.researchgate.net/publication/358956953_Sequence_Labeling_Architectures_in_Diglossia_-_a_case_study_of_Arabic_and_its_dialects}
}
```