Dataset: mwsc
Tasks: multiple-choice
Languages: en
Multilinguality: monolingual
Size: n<1K
Language creators: expert-generated
Annotation creators: expert-generated
Source datasets: extended|winograd_wsc
Preprint: arxiv:1806.08730
License: cc-by-4.0

Examples taken from the Winograd Schema Challenge, modified so that every answer is a single word from the context. This Modified Winograd Schema Challenge (MWSC) ensures that scores are neither inflated nor deflated by oddities in phrasing.
An example looks as follows:
{ "sentence": "The city councilmen refused the demonstrators a permit because they feared violence.", "question": "Who feared violence?", "options": [ "councilmen", "demonstrators" ], "answer": "councilmen" }
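The record layout above, together with the stated property that answers are a single word from the context, can be checked mechanically. The following is a minimal sketch, not part of the dataset tooling: the helper name `check_mwsc_record` is hypothetical, and it assumes that "context" means the sentence plus the question and that matching is case-insensitive.

```python
import json
import re

# Record copied from the example above.
record = json.loads("""
{
  "sentence": "The city councilmen refused the demonstrators a permit because they feared violence.",
  "question": "Who feared violence?",
  "options": ["councilmen", "demonstrators"],
  "answer": "councilmen"
}
""")


def check_mwsc_record(rec):
    """Hypothetical helper: return a list of problems found in one record."""
    problems = []
    for key in ("sentence", "question", "options", "answer"):
        if key not in rec:
            problems.append(f"missing field: {key}")
    if problems:
        return problems

    # Assumption: the "context" is the sentence plus the question, lowercased.
    context = (rec["sentence"] + " " + rec["question"]).lower()
    context_words = set(re.findall(r"[\w'-]+", context))

    for option in rec["options"]:
        if len(option.split()) != 1:
            problems.append(f"option is not a single word: {option!r}")
        elif option.lower() not in context_words:
            problems.append(f"option not found in context: {option!r}")

    if rec["answer"] not in rec["options"]:
        problems.append(f"answer is not one of the options: {rec['answer']!r}")
    return problems


print(check_mwsc_record(record))  # prints []
```

An empty list means the record has all four fields, every option is a single word found in the context, and the answer is one of the options.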
The data fields are the same among all splits.
name | train | validation | test |
---|---|---|---|
default | 80 | 82 | 100 |
Our code for running decaNLP has been open sourced under BSD-3-Clause.
We chose to restrict decaNLP to datasets that were free and publicly accessible for research, but you should check their individual terms if you deviate from this use case.
From the Winograd Schema Challenge:
Both versions of the collections are licenced under a Creative Commons Attribution 4.0 International License.
If you use this in your work, please cite:
@article{McCann2018decaNLP,
  title={The Natural Language Decathlon: Multitask Learning as Question Answering},
  author={Bryan McCann and Nitish Shirish Keskar and Caiming Xiong and Richard Socher},
  journal={arXiv preprint arXiv:1806.08730},
  year={2018}
}
Thanks to @thomwolf, @lewtun, @ghomasHudson, @lhoestq for adding this dataset.