Dataset: flores
Tasks: translation (computational processing)
Size: 1K<n<10K
Language creators: found
Annotation creators: found
Preprint: arxiv:1902.01382
License:
Evaluation datasets for low-resource machine translation: Nepali-English and Sinhala-English.
neen
An example of 'validation' looks as follows.
This example was too long and was cropped:
{
    "translation": "{\"en\": \"This is the wrong translation!\", \"ne\": \"यस वाहेक आगम पूजा, तारा पूजा, व्रत आदि पनि घरभित्र र वाहिर दुवै स्थानमा गरेको पा..."
}
sien
An example of 'validation' looks as follows.
This example was too long and was cropped:
{
    "translation": "{\"en\": \"This is the wrong translation!\", \"si\": \"එවැනි ආවරණයක් ලබාදීමට රක්ෂණ සපයන්නෙකු කැමති වුවත් ඒ සාමාන් යයෙන් බොහෝ රටවල පොදු ..."
}
The data fields are the same among all splits.

| name | validation | test |
|---|---|---|
| neen | 2560 | 2836 |
| sien | 2899 | 2767 |
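
The cropped examples above suggest that each record carries a single `translation` field keyed by language code (`en` plus `ne` or `si`). The following is a minimal sketch of turning such records into (source, target) sentence pairs. The helper name `iter_pairs` and the sample sentences are hypothetical placeholders, not part of the dataset or of any library. The sketch also assumes the field may arrive either as a mapping or as a complete JSON-encoded string.

```python
import json
from typing import Iterable, Iterator, Mapping, Tuple


def iter_pairs(
    records: Iterable[Mapping[str, object]], src: str, tgt: str = "en"
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from flores-style records.

    Hypothetical helper: assumes each record has a "translation" field that is
    either a dict keyed by language code or a JSON string encoding such a dict.
    """
    for record in records:
        translation = record["translation"]
        if isinstance(translation, str):
            # Decode the JSON-encoded form into a dict.
            translation = json.loads(translation)
        yield translation[src], translation[tgt]


# Illustrative placeholder records (NOT taken from the dataset).
sample_records = [
    {"translation": {"en": "Hello.", "ne": "नमस्ते।"}},
    {"translation": json.dumps({"en": "Thank you.", "ne": "धन्यवाद।"})},
]

pairs = list(iter_pairs(sample_records, src="ne"))
```

For the `sien` configuration the same helper would be called with `src="si"`.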
@misc{guzmn2019new,
    title={Two New Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English},
    author={Francisco Guzman and Peng-Jen Chen and Myle Ott and Juan Pino and Guillaume Lample and Philipp Koehn and Vishrav Chaudhary and Marc'Aurelio Ranzato},
    year={2019},
    eprint={1902.01382},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Thanks to @thomwolf, @patrickvonplaten, @lewtun for adding this dataset.