Dataset: turkish_ner
Tasks: token-classification
Languages: tr
Multilinguality: monolingual
Size: 100K<n<1M
Language creators: expert-generated
Annotation creators: machine-generated
Source datasets: original
ArXiv: 1702.02363
License: cc-by-4.0

Automatically annotated Turkish corpus for named entity recognition and text categorization using large-scale gazetteers. The constructed gazetteers contain approximately 300K entities with thousands of fine-grained entity types under 25 different domains.
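To make the idea of gazetteer-based automatic annotation concrete, here is a minimal sketch of greedy longest-match lookup that emits BIO tags. It is an illustration only and not the authors' pipeline: the function `tag_with_gazetteer`, the toy gazetteer, the `max_len` parameter, and the labels `PERSON` and `LOCATION` are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple


def tag_with_gazetteer(
    tokens: List[str],
    gazetteer: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in a gazetteer.

    Keys of `gazetteer` are lowercased token tuples; values are entity labels.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            # Note: str.lower() is not Turkish-aware (dotted/dotless I),
            # so a real system would need locale-specific case folding.
            key = tuple(t.lower() for t in tokens[i:i + n])
            label = gazetteer.get(key)
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Hypothetical toy gazetteer; the real gazetteers hold roughly 300K entities.
toy_gazetteer = {
    ("mustafa", "kemal", "atatürk"): "PERSON",
    ("ankara",): "LOCATION",
}

tokens = ["Mustafa", "Kemal", "Atatürk", "Ankara", "şehrine", "gitti"]
tags = tag_with_gazetteer(tokens, toy_gazetteer)
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Longest-match-first is one simple way to resolve overlapping entries; annotations produced this way are machine-generated and can be noisy where an entity name is ambiguous.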
[Needs More Information]
Turkish
[More Information Needed]
[More Information Needed]
The dataset provides a training split only; no validation or test split is included.
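Since only a training split is available, users who need evaluation data have to hold some examples out themselves. The sketch below shows one deterministic way to do that with the standard library; the helper `holdout_split`, its parameters, and the placeholder sentences are hypothetical and not part of the dataset.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def holdout_split(
    examples: Sequence[T],
    valid_fraction: float = 0.1,
    seed: int = 42,
) -> Tuple[List[T], List[T]]:
    """Split examples into (train, valid) with a seeded shuffle of indices."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    indices = list(range(len(examples)))
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    n_valid = int(len(examples) * valid_fraction)
    valid_idx = set(indices[:n_valid])
    # Preserve the original order within each part.
    train = [ex for i, ex in enumerate(examples) if i not in valid_idx]
    valid = [ex for i, ex in enumerate(examples) if i in valid_idx]
    return train, valid


# Placeholder data standing in for annotated sentences.
sentences = [f"sentence-{i}" for i in range(20)]
train, valid = holdout_split(sentences, valid_fraction=0.25)
print(len(train), len(valid))
```

When the corpus is loaded through the Hugging Face `datasets` library, `Dataset.train_test_split` offers the same kind of seeded split without custom code.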
[More Information Needed]
[More Information Needed]
Who are the source language producers?
[More Information Needed]
[More Information Needed]
Who are the annotators?
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
H. Bahadir Sahin, Caglar Tirkaz, Eray Yildiz, Mustafa Tolga Eren and Omer Ozan Sonmez
Creative Commons Attribution 4.0 International
@article{DBLP:journals/corr/SahinTYES17,
  author        = {H. Bahadir Sahin and Caglar Tirkaz and Eray Yildiz and Mustafa Tolga Eren and Omer Ozan Sonmez},
  title         = {Automatically Annotated Turkish Corpus for Named Entity Recognition and Text Categorization using Large-Scale Gazetteers},
  journal       = {CoRR},
  volume        = {abs/1702.02363},
  year          = {2017},
  url           = {http://arxiv.org/abs/1702.02363},
  archivePrefix = {arXiv},
  eprint        = {1702.02363},
  timestamp     = {Mon, 13 Aug 2018 16:46:36 +0200},
  biburl        = {https://dblp.org/rec/journals/corr/SahinTYES17.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}
Thanks to @merveenoyan for adding this dataset.