数据集:

igbo_ner

语言:

ig

计算机处理:

monolingual

大小:

10K<n<100K

语言创建人:

found

批注创建人:

found

源数据集:

original

预印本库:

arxiv:2004.00648
中文

Dataset Card for Igbo NER dataset

Dataset Summary

[More Information Needed]

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

Here is an example from the dataset:

{'content_n': 'content_0', 'named_entity': 'Ike Ekweremmadụ', 'sentences': ['Ike Ekweremmadụ', "Ike ịda jụụ otụ nkeji banyere oke ogbugbu na-eme n'ala Naijiria agwụla Ekweremmadụ"]}

Data Fields

  • content_n : ID
  • named_entity : Name of the entity
  • sentences : List of sentences for the entity

Data Splits

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

[More Information Needed]

Citation Information

@misc{ezeani2020igboenglish, title={Igbo-English Machine Translation: An Evaluation Benchmark}, author={Ignatius Ezeani and Paul Rayson and Ikechukwu Onyenwe and Chinedu Uchechukwu and Mark Hepple}, year={2020}, eprint={2004.00648}, archivePrefix={arXiv}, primaryClass={cs.CL} }

Contributions

Thanks to @purvimisal for adding this dataset.