Dataset: turkish_shrinked_ner
Task: token classification
Language: tr
Multilinguality: monolingual
Size: 100K<n<1M
Language creators: expert-generated
Annotation creators: machine-generated
License: cc-by-4.0
A reduced, processed version of the turkish_ner dataset with 48 entity types.
Original turkish_ner dataset: An automatically annotated Turkish corpus for named entity recognition and text categorization, built using large-scale gazetteers. The constructed gazetteers contain approximately 300K entities with thousands of fine-grained entity types across 25 domains.
The 48 reduced entity types are: academic, academic_person, aircraft, album_person, anatomy, animal, architect_person, capital, chemical, clothes, country, culture, currency, date, food, genre, government, government_person, language, location, material, measure, medical, military, military_person, nation, newspaper, organization, organization_person, person, production_art_music, production_art_music_person, quantity, religion, science, shape, ship, software, space, space_person, sport, sport_name, sport_person, structure, subject, tech, train, vehicle
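For training a token classifier, the entity types listed above usually need to be turned into an integer label map. The sketch below does this under the assumption that the tags follow a BIO scheme (one `O` label plus a `B-`/`I-` pair per type). The card does not state the tagging scheme, so check the dataset's actual label names before relying on this. The names `ENTITY_TYPES`, `build_bio_labels`, `LABEL2ID`, and `ID2LABEL` are illustrative and not part of the dataset.

```python
# Entity types copied from the list in this card.
ENTITY_TYPES = """
academic academic_person aircraft album_person anatomy animal
architect_person capital chemical clothes country culture currency date
food genre government government_person language location material
measure medical military military_person nation newspaper organization
organization_person person production_art_music
production_art_music_person quantity religion science shape ship software
space space_person sport sport_name sport_person structure subject tech
train vehicle
""".split()


def build_bio_labels(entity_types):
    """Return 'O' followed by a B-/I- pair per type (ASSUMED BIO scheme)."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


LABELS = build_bio_labels(ENTITY_TYPES)
LABEL2ID = {label: index for index, label in enumerate(LABELS)}
ID2LABEL = {index: label for label, index in LABEL2ID.items()}
```

Under this assumption the label space has 2 × 48 + 1 = 97 classes. If the published dataset orders or names its labels differently, use its own label list instead of the one built here.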
[Needs More Information]
Turkish
[Needs More Information]
[Needs More Information]
The dataset provides only a training split.
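Because no validation or test split is provided, users who want held-out data have to carve it out of the training set themselves. The following is a minimal sketch of a seeded random split in plain Python. The function name `train_validation_split`, the 10% fraction, the seed, and the toy records are all assumptions for illustration, not part of the dataset. With the Hugging Face `datasets` library, `Dataset.train_test_split` serves the same purpose.

```python
import random


def train_validation_split(examples, validation_fraction=0.1, seed=42):
    """Split a list of examples into (train, validation) with a fixed seed."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    indices = list(range(len(examples)))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_validation = int(len(indices) * validation_fraction)
    validation_indices = set(indices[:n_validation])
    train = [ex for i, ex in enumerate(examples) if i not in validation_indices]
    validation = [ex for i, ex in enumerate(examples) if i in validation_indices]
    return train, validation


# Toy records standing in for real sentences (hypothetical field names).
toy_examples = [
    {"id": i, "tokens": ["Ankara"], "ner_tags": ["B-capital"]}
    for i in range(20)
]
train_set, validation_set = train_validation_split(toy_examples)
```

A purely random split can leave rare entity types out of the validation set; stratifying by entity type would be a reasonable refinement if that matters for evaluation.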
[Needs More Information]
[Needs More Information]
Who are the source language producers? [Needs More Information]
[Needs More Information]
Who are the annotators? [Needs More Information]
[Needs More Information]
[Needs More Information]
[Needs More Information]
[Needs More Information]
Behcet Senturk
Creative Commons Attribution 4.0 International
[Needs More Information]
Thanks to @bhctsntrk for adding this dataset.