Dataset: projecte-aina/wnli-ca
Tasks: Text Classification
Languages: ca
Multilinguality: monolingual
Language creators: found
Annotations creators: expert-generated
Source datasets: extended|glue
License: cc-by-4.0
"A Winograd schema is a pair of sentences that differ in only one or two words and that contain an ambiguity that is resolved in opposite ways in the two sentences and requires the use of world knowledge and reasoning for its resolution. The schema takes its name from Terry Winograd." Source: The Winograd Schema Challenge.
The Winograd NLI dataset presents 855 sentence pairs, in which the first sentence contains an ambiguity and the second one a possible interpretation of it. The label indicates if the interpretation is correct (1) or not (0).
This dataset is a professional translation into Catalan of the Winograd NLI dataset as published in the GLUE Benchmark.
Both the original dataset and this translation are licensed under a Creative Commons Attribution 4.0 International License.
Textual entailment, Text classification, Language Model.
The dataset is in Catalan (ca-CA).
The dataset is distributed as three TSV files.
index | sentence 1 | sentence 2 | label |
---|---|---|---|
0 | Vaig clavar una agulla en una pastanaga. Quan la vaig treure, tenia un forat. | La pastanaga tenia un forat. | 1 |
1 | En Joan no podia veure l’escenari amb en Guillem davant seu perquè és molt baix. | En Joan és molt baix. | 1 |
2 | Els policies van arrestar tots els membres de la banda. Volien aturar el tràfic de drogues del barri. | Els policies volien aturar el tràfic de drogues del barri. | 1 |
3 | L’Esteve segueix els passos d’en Frederic en tot. L’influencia moltíssim. | L’Esteve l’influencia moltíssim. | 0 |
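As a minimal sketch of how rows like the ones above could be read from a tab-separated file, the snippet below parses two of the sample rows with Python's standard `csv` module. The header names (`index`, `sentence1`, `sentence2`, `label`) and the helper `read_wnli_tsv` are assumptions made for illustration; check the actual header line of the TSV files before relying on them.

```python
import csv
import io

# Two rows copied from the sample table, laid out as tab-separated text.
# ASSUMPTION: the header names below are illustrative and may differ
# from the header line in the distributed TSV files.
SAMPLE_TSV = (
    "index\tsentence1\tsentence2\tlabel\n"
    "0\tVaig clavar una agulla en una pastanaga. Quan la vaig treure, tenia un forat."
    "\tLa pastanaga tenia un forat.\t1\n"
    "3\tL’Esteve segueix els passos d’en Frederic en tot. L’influencia moltíssim."
    "\tL’Esteve l’influencia moltíssim.\t0\n"
)


def read_wnli_tsv(handle):
    """Parse a WNLI-style TSV stream into a list of dicts with integer labels."""
    # QUOTE_NONE keeps any quotation marks inside sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        rows.append(
            {
                "index": int(row["index"]),
                "sentence1": row["sentence1"],
                "sentence2": row["sentence2"],
                "label": int(row["label"]),
            }
        )
    return rows


examples = read_wnli_tsv(io.StringIO(SAMPLE_TSV))
for ex in examples:
    # Label 1 means the second sentence is a correct interpretation, 0 that it is not.
    verdict = "correct" if ex["label"] == 1 else "incorrect"
    print(ex["index"], verdict, ex["sentence2"])
```

To read a file from disk instead, the same function can be given an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points to one of the TSV files.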
We translated this dataset to contribute to the development of language models in Catalan, a low-resource language, and to allow inter-lingual comparisons.
This is a professional translation of the WNLI dataset into Catalan, commissioned by BSC TeMU within Projecte AINA.
For more information on how the Winograd NLI dataset was created, visit the webpage The Winograd Schema Challenge.
Who are the source language producers?
For more information on how the Winograd NLI dataset was created, visit the webpage The Winograd Schema Challenge.
We commissioned a professional translation of the WNLI dataset into Catalan.
Who are the annotators?
The translation was commissioned from a professional translator.
No personal or sensitive information is included.
This dataset contributes to the development of language models in Catalan, a low-resource language.
[N/A]
[N/A]
Text Mining Unit (TeMU) at the Barcelona Supercomputing Center (bsc-temu@bsc.es).
This work was funded by the Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya within the framework of Projecte AINA.
This work is licensed under a CC Attribution 4.0 International License.
[N/A]