Multi-lingual BERT Bengali Named Entity Recognition
mBERT-Bengali-NER is a transformer-based Bengali NER model built with the bert-base-multilingual-uncased model and the Wikiann dataset.
How to Use
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("sagorsarker/mbert-bengali-ner")
model = AutoModelForTokenClassification.from_pretrained("sagorsarker/mbert-bengali-ner")
# aggregation_strategy="simple" replaces the deprecated grouped_entities=True
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
example = "আমি জাহিদ এবং আমি ঢাকায় বাস করি।"
ner_results = nlp(example)
print(ner_results)
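With entity grouping enabled, the pipeline returns a list of dictionaries, each carrying an `entity_group`, a `score`, and the matched `word` (plus character offsets). The sketch below shows one way to post-process such a list by dropping low-confidence entities. The helper name `keep_confident`, the 0.5 threshold, and the sample records are hypothetical and made up for illustration; they are not actual output of this model.

```python
from typing import Any, Dict, List


def keep_confident(entities: List[Dict[str, Any]], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Return only the entities whose confidence score is at least min_score."""
    return [entity for entity in entities if entity["score"] >= min_score]


# Hypothetical records in the shape produced by a grouped NER pipeline.
# The scores are invented for illustration, not real model output.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "জাহিদ"},
    {"entity_group": "LOC", "score": 0.42, "word": "ঢাকায়"},
]

confident = keep_confident(sample, min_score=0.5)
for entity in confident:
    print(entity["entity_group"], entity["word"])
```

In practice, `sample` would be replaced by `ner_results` from the snippet above, and the threshold would be tuned on held-out data.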
Label and ID Mapping
| Label ID | Label |
|---|---|
| 0 | O |
| 1 | B-PER |
| 2 | I-PER |
| 3 | B-ORG |
| 4 | I-ORG |
| 5 | B-LOC |
| 6 | I-LOC |
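The mapping above follows the usual BIO scheme: a `B-` tag opens an entity, `I-` tags of the same type continue it, and `O` marks tokens outside any entity. As a minimal sketch, assuming you work with raw per-token label IDs instead of the grouped pipeline output, the IDs can be decoded into entity spans as follows. The function name `decode_spans` and the sample tokens and label IDs are hypothetical, chosen only to illustrate the mapping.

```python
from typing import List, Tuple

# Label mapping copied from the table above.
ID2LABEL = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG", 4: "I-ORG", 5: "B-LOC", 6: "I-LOC"}


def decode_spans(tokens: List[str], label_ids: List[int]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity that is currently open, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label_id in zip(tokens, label_ids):
        label = ID2LABEL[label_id]
        if label.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity, is treated as outside.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Hypothetical token-level predictions, not actual model output.
tokens = ["আমি", "জাহিদ", "এবং", "আমি", "ঢাকায়", "বাস", "করি"]
label_ids = [0, 1, 0, 0, 5, 0, 0]
spans = decode_spans(tokens, label_ids)
print(spans)
```

Treating a stray `I-` tag as outside is one design choice among several; some decoders instead promote it to the start of a new entity.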
Training Details
Evaluation Results
| Model | F1 | Precision | Recall | Accuracy | Loss |
|---|---|---|---|---|---|
| mBert-Bengali-NER | 0.97105 | 0.96769 | 0.97443 | 0.97682 | 0.12511 |
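As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two columns. The short sketch below does that with the values from the table; the variable names are illustrative.

```python
# Precision and recall as reported in the evaluation table.
precision = 0.96769
recall = 0.97443

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 5))
```

The result agrees with the reported F1 of 0.97105 to five decimal places.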