Dataset: kor_ner
Task: token-classification
Language: ko
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: other
Annotation creators: expert-generated
Source datasets: original
License: mit
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
Each row consists of the following fields:
Note that, by design, tokens, pos_tags, and ner_tags always have the same length.
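The length guarantee can be checked per row before any token-level processing. The following is a minimal sketch, assuming each row is a plain dict with the three fields named above; the helper name check_row and the example row (tokens and tag ids) are made up for illustration and are not taken from the dataset.

```python
def check_row(row):
    """Return True if the three per-token fields have the same length."""
    n = len(row["tokens"])
    return len(row["pos_tags"]) == n and len(row["ner_tags"]) == n


# Hypothetical row: the tokens and tag ids are invented for illustration.
example_row = {
    "tokens": ["서울", "에", "갔다"],
    "pos_tags": [12, 7, 2],
    "ner_tags": [4, 1, 1],
}

assert check_row(example_row)
```

A row that fails this check would indicate a loading or preprocessing problem rather than a property of the data.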
pos_tags corresponds to the list below:
['SO', 'SS', 'VV', 'XR', 'VCP', 'JC', 'VCN', 'JKB', 'MM', 'SP', 'XSN', 'SL', 'NNP', 'NP', 'EP', 'JKQ', 'IC', 'XSA', 'EC', 'EF', 'SE', 'XPN', 'ETN', 'SH', 'XSV', 'MAG', 'SW', 'ETM', 'JKO', 'NNB', 'MAJ', 'NNG', 'JKV', 'JKC', 'VA', 'NR', 'JKG', 'VX', 'SF', 'JX', 'JKS', 'SN']
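If pos_tags is stored as integer ids, the list above can be used as a lookup table. This sketch assumes that the ids index the list in the order shown (id 0 is 'SO', id 1 is 'SS', and so on), which is not stated explicitly here; the function name decode_pos is hypothetical.

```python
# POS label names, copied in the order given in the card.
POS_LABELS = [
    'SO', 'SS', 'VV', 'XR', 'VCP', 'JC', 'VCN', 'JKB', 'MM', 'SP', 'XSN',
    'SL', 'NNP', 'NP', 'EP', 'JKQ', 'IC', 'XSA', 'EC', 'EF', 'SE', 'XPN',
    'ETN', 'SH', 'XSV', 'MAG', 'SW', 'ETM', 'JKO', 'NNB', 'MAJ', 'NNG',
    'JKV', 'JKC', 'VA', 'NR', 'JKG', 'VX', 'SF', 'JX', 'JKS', 'SN',
]


def decode_pos(tag_ids):
    """Map integer POS ids to label names (assumes ids follow list order)."""
    return [POS_LABELS[i] for i in tag_ids]


decoded = decode_pos([12, 7, 2])  # ['NNP', 'JKB', 'VV'] under the assumed order
```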
ner_tags correspond to the following:
["I", "O", "B_OG", "B_TI", "B_LC", "B_DT", "B_PS"]
The prefix B denotes the first item of a phrase, and I denotes any non-initial word. In addition, OG represents an organization; TI, time; LC, location; DT, date; and PS, person.
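The tagging scheme above can be turned into entity spans with a single pass over the tags. This is a minimal sketch, assuming ner_tags holds integer ids that index the label list in the order shown; the function name extract_entities and the example sentence are hypothetical. Because the I tag carries no entity type, a span takes its type from the B_ tag that opened it.

```python
# NER label names, copied in the order given in the card.
NER_LABELS = ["I", "O", "B_OG", "B_TI", "B_LC", "B_DT", "B_PS"]


def extract_entities(tokens, ner_tag_ids):
    """Group tokens into (entity_type, [tokens]) spans.

    A B_<type> tag opens a span, each following I tag extends it,
    and an O tag or a new B_ tag closes it.
    """
    entities = []
    current = None
    for token, tag_id in zip(tokens, ner_tag_ids):
        label = NER_LABELS[tag_id]
        if label.startswith("B_"):
            if current is not None:
                entities.append(current)
            current = (label[2:], [token])
        elif label == "I" and current is not None:
            current[1].append(token)
        else:
            # "O", or a stray "I" with no open span: close any open span.
            if current is not None:
                entities.append(current)
            current = None
    if current is not None:
        entities.append(current)
    return entities


# Hypothetical example: the tokens and tag ids are invented for illustration.
tokens = ["삼성", "전자", "는", "서울", "에", "있다"]
tags = [2, 0, 1, 4, 1, 1]
spans = extract_entities(tokens, tags)
# spans == [("OG", ["삼성", "전자"]), ("LC", ["서울"])]
```

Stray I tags that do not follow a B_ tag are skipped here; a stricter variant could raise an error instead.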
[More Information Needed]
[More Information Needed]
[More Information Needed]
Who are the source language producers?
[More Information Needed]
[More Information Needed]
Who are the annotators?
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
Thanks to @jaketae for adding this dataset.