数据集:
bianet
A parallel news corpus in Turkish, Kurdish and English; Bianet collects 3,214 Turkish articles with their sentence-aligned Kurdish or English translations from the Bianet online newspaper.
3 languages, 3 bitexts total number of files: 6 total number of tokens: 2.25M total number of sentence fragments: 0.14M
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
Who are the source language producers?[More Information Needed]
[More Information Needed]
Who are the annotators?[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
CC-BY-SA-4.0
@InProceedings{ATAMAN18.6, author = {Duygu Ataman}, title = {Bianet: A Parallel News Corpus in Turkish, Kurdish and English}, booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)}, year = {2018}, month = {may}, date = {7-12}, location = {Miyazaki, Japan}, editor = {Jinhua Du and Mihael Arcan and Qun Liu and Hitoshi Isahara}, publisher = {European Language Resources Association (ELRA)}, address = {Paris, France}, isbn = {979-10-95546-15-3}, language = {english} }
Thanks to @param087 for adding this dataset.