Dataset: iwslt2017
Tasks: Translation
Computer processing: translation
Size: 1M<n<10M
Language creators: expert-generated
Annotation creators: crowdsourced
Source datasets: original
License: cc-by-nc-nd-4.0

The IWSLT 2017 Multilingual Task addresses text translation, including zero-shot translation, with a single MT system across all directions including English, German, Dutch, Italian and Romanian. As an unofficial task, conventional bilingual text translation is offered between English and Arabic, French, Japanese, Chinese, German and Korean.
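To make "all directions" concrete, the short sketch below enumerates every ordered language pair among the five multilingual-task languages. The two-letter codes (`en`, `de`, `nl`, `it`, `ro`) and the `src-tgt` naming are assumptions made for illustration; the description above names only the languages, not the codes.

```python
from itertools import permutations

# Assumed ISO 639-1 codes for English, German, Dutch, Italian and Romanian.
MULTILINGUAL_LANGS = ["en", "de", "nl", "it", "ro"]

# Every ordered (source, target) pair with source != target.
directions = [f"{src}-{tgt}" for src, tgt in permutations(MULTILINGUAL_LANGS, 2)]

print(len(directions))  # 5 languages * 4 possible targets each = 20
print(directions)
```

A single system covering all of these directions has to handle 20 ordered pairs, which is why the task description speaks of one MT system rather than one model per pair.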
iwslt2017-ar-en
An example of 'train' looks as follows.
This example was too long and was cropped: { "translation": "{\"ar\": \"لقد طرت في \\\"القوات الجوية \\\" لمدة ثمان سنوات. والآن أجد نفسي مضطرا لخلع حذائي قبل صعود الطائرة!\", \"en\": \"I flew on Air ..." }

iwslt2017-de-en
An example of 'train' looks as follows.
{ "translation": { "de": "Es ist mir wirklich eine Ehre, zweimal auf dieser Bühne stehen zu dürfen. Tausend Dank dafür.", "en": "And it's truly a great honor to have the opportunity to come to this stage twice; I'm extremely grateful." } }

iwslt2017-en-ar
An example of 'train' looks as follows.
This example was too long and was cropped: { "translation": "{\"ar\": \"لقد طرت في \\\"القوات الجوية \\\" لمدة ثمان سنوات. والآن أجد نفسي مضطرا لخلع حذائي قبل صعود الطائرة!\", \"en\": \"I flew on Air ..." }

iwslt2017-en-de
An example of 'validation' looks as follows.
{ "translation": { "de": "Die nächste Folie, die ich Ihnen zeige, ist eine Zeitrafferaufnahme was in den letzten 25 Jahren passiert ist.", "en": "The next slide I show you will be a rapid fast-forward of what's happened over the last 25 years." } }

iwslt2017-en-fr
An example of 'validation' looks as follows.
{ "translation": { "en": "But this understates the seriousness of this particular problem because it doesn't show the thickness of the ice.", "fr": "Mais ceci tend à amoindrir le problème parce qu'on ne voit pas l'épaisseur de la glace." } }
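The records above all share one shape: a `translation` object keyed by language code. As a minimal sketch, the hypothetical helper `split_pair` below pulls a (source, target) pair out of such a record. It assumes that a configuration name of the form `iwslt2017-<src>-<tgt>` lists the source language first; the card itself does not state this, so treat the direction as an assumption.

```python
def split_pair(record, config_name):
    """Return (source_text, target_text) for one record.

    Assumes config names look like 'iwslt2017-<src>-<tgt>' and that the
    first code is the source language (an illustrative assumption).
    """
    _, src, tgt = config_name.rsplit("-", 2)
    translation = record["translation"]
    return translation[src], translation[tgt]


# The 'train' example shown for iwslt2017-de-en.
record = {
    "translation": {
        "de": "Es ist mir wirklich eine Ehre, zweimal auf dieser Bühne stehen zu dürfen. Tausend Dank dafür.",
        "en": "And it's truly a great honor to have the opportunity to come to this stage twice; I'm extremely grateful.",
    }
}

source, target = split_pair(record, "iwslt2017-de-en")
print(source)
print(target)
```

Because both languages live in the same record, the same example can serve either direction: passing `"iwslt2017-en-de"` simply swaps which text is returned first.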
The data fields are the same among all splits.
iwslt2017-ar-en

name | train | validation | test |
---|---|---|---|
iwslt2017-ar-en | 231713 | 888 | 8583 |
iwslt2017-de-en | 206112 | 888 | 8079 |
iwslt2017-en-ar | 231713 | 888 | 8583 |
iwslt2017-en-de | 206112 | 888 | 8079 |
iwslt2017-en-fr | 232825 | 890 | 8597 |
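As a quick check on the table, the sketch below copies the split sizes into a dictionary and computes each configuration's total and the share of examples in the training split. The names `SPLIT_SIZES`, `total_examples` and `split_fraction` are hypothetical and exist only for this illustration.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {
    "iwslt2017-ar-en": {"train": 231713, "validation": 888, "test": 8583},
    "iwslt2017-de-en": {"train": 206112, "validation": 888, "test": 8079},
    "iwslt2017-en-ar": {"train": 231713, "validation": 888, "test": 8583},
    "iwslt2017-en-de": {"train": 206112, "validation": 888, "test": 8079},
    "iwslt2017-en-fr": {"train": 232825, "validation": 890, "test": 8597},
}


def total_examples(config_name):
    """Sum of train, validation and test examples for one configuration."""
    return sum(SPLIT_SIZES[config_name].values())


def split_fraction(config_name, split):
    """Share of a configuration's examples that fall in the given split."""
    return SPLIT_SIZES[config_name][split] / total_examples(config_name)


for name in SPLIT_SIZES:
    print(name, total_examples(name), f"{split_fraction(name, 'train'):.1%}")
```

The reversed pairs (for example `ar-en` and `en-ar`) report identical sizes in the table, so their totals and fractions come out the same.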
Creative Commons BY-NC-ND
See the [TED Talks Usage Policy](https://www.ted.com/about/our-organization/our-policies-terms/ted-talks-usage-policy).
@inproceedings{cettolo-etal-2017-overview,
    title = "Overview of the {IWSLT} 2017 Evaluation Campaign",
    author = {Cettolo, Mauro and Federico, Marcello and Bentivogli, Luisa and Niehues, Jan and St{\"u}ker, Sebastian and Sudoh, Katsuhito and Yoshino, Koichiro and Federmann, Christian},
    booktitle = "Proceedings of the 14th International Conference on Spoken Language Translation",
    month = dec # " 14-15",
    year = "2017",
    address = "Tokyo, Japan",
    publisher = "International Workshop on Spoken Language Translation",
    url = "https://aclanthology.org/2017.iwslt-1.1",
    pages = "2--14",
}