Dataset: winograd_wsc
Task: multiple-choice
Language: en
Multilinguality: monolingual
Size: n<1K
Language creators: expert-generated
Annotation creators: expert-generated
Source datasets: original
License: cc-by-4.0

A Winograd schema is a pair of sentences that differ in only one or two words and that contain an ambiguity that is resolved in opposite ways in the two sentences and requires the use of world knowledge and reasoning for its resolution. The schema takes its name from a well-known example by Terry Winograd:
The city councilmen refused the demonstrators a permit because they [feared/advocated] violence.
If the word is "feared", then "they" presumably refers to the city council; if it is "advocated", then "they" presumably refers to the demonstrators.
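The pairing can be made concrete with a small sketch. The bracket notation `[feared/advocated]` is taken from the example above; the function name `expand_schema` is hypothetical and is used only for illustration, not as part of any official tooling for the challenge.

```python
import re


def expand_schema(template):
    """Expand 'a [x/y] b' into the two sentences of a schema pair.

    Hypothetical helper: assumes exactly one [word/word] alternation.
    """
    match = re.search(r"\[([^\[\]/]+)/([^\[\]/]+)\]", template)
    if match is None:
        raise ValueError("template has no [word/word] alternation")
    # Substitute each alternative in turn, keeping the rest of the sentence.
    return [
        template[:match.start()] + word + template[match.end():]
        for word in match.groups()
    ]


template = ("The city councilmen refused the demonstrators a permit "
            "because they [feared/advocated] violence.")

# Referents as stated in the text above, one per variant.
referents = ["the city council", "the demonstrators"]
for sentence, referent in zip(expand_schema(template), referents):
    print(f"{sentence}  ->  'they' = {referent}")
```

Changing a single word flips the referent of "they", which is what makes the two sentences a schema rather than two unrelated examples.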
From the official webpage:
A contest, entitled the Winograd Schema Challenge was run once, in 2016. At that time, there was a cash prize offered for achieving human-level performance in the contest. Since then, the sponsor has withdrawn; therefore NO CASH PRIZES CAN BE OFFERED OR WILL BE AWARDED FOR ANY KIND OF PERFORMANCE OR ACHIEVEMENT ON THIS CHALLENGE.
The dataset is in English.
Translation of 12 WSs into Chinese (translated by Wei Xu).
Translations into Japanese, by Soichiro Tanaka, Rafal Rzepka, and Shiho Katajima: one translation changing English names to Japanese, and one translation preserving English names.
Translation into French, by Pascal Amsili and Olga Seminck
Winograd Schemas in Portuguese by Gabriela Melo, Vinicius Imaizumi, and Fábio Cozman.
Mandarinograd: A Chinese Collection of Winograd Schemas by Timothée Bernard and Ting Han, LREC-2020.
Each instance contains a text passage with a designated pronoun and two possible answers indicating which entity in the passage the pronoun represents. An example instance looks like the following:
{
  'label': 0,
  'options': ['The city councilmen', 'The demonstrators'],
  'pronoun': 'they',
  'pronoun_loc': 63,
  'quote': 'they feared violence',
  'quote_loc': 63,
  'source': '(Winograd 1972)',
  'text': 'The city councilmen refused the demonstrators a permit because they feared violence.'
}
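The offset fields can be sanity-checked directly against the text. The following is a minimal sketch using only the example instance above; the helper names `check_offsets` and `resolve` are hypothetical and not part of the dataset or of any library. It assumes `pronoun_loc` and `quote_loc` are zero-based character offsets into `text`, which is consistent with this example but is not stated explicitly in the card.

```python
# The example instance from this card, copied verbatim.
instance = {
    "label": 0,
    "options": ["The city councilmen", "The demonstrators"],
    "pronoun": "they",
    "pronoun_loc": 63,
    "quote": "they feared violence",
    "quote_loc": 63,
    "source": "(Winograd 1972)",
    "text": "The city councilmen refused the demonstrators a permit "
            "because they feared violence.",
}


def check_offsets(inst):
    """Return True if the pronoun and quote appear at their stated offsets.

    Assumes zero-based character offsets into inst["text"].
    """
    text = inst["text"]
    pronoun_ok = text.startswith(inst["pronoun"], inst["pronoun_loc"])
    quote_ok = text.startswith(inst["quote"], inst["quote_loc"])
    return pronoun_ok and quote_ok


def resolve(inst):
    """Return the text with the pronoun replaced by the labelled option."""
    start = inst["pronoun_loc"]
    end = start + len(inst["pronoun"])
    referent = inst["options"][inst["label"]]
    # Simplification: lower-case the first letter, since the pronoun in
    # this example sits mid-sentence.
    referent = referent[0].lower() + referent[1:]
    return inst["text"][:start] + referent + inst["text"][end:]


print(check_offsets(instance))
print(resolve(instance))
```

Here `label` is used as an index into `options`, so a label of 0 selects "The city councilmen" as the referent of "they".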
Only a test split is included.
The Winograd Schema Challenge was proposed as an automated evaluation of an AI system's commonsense linguistic understanding. From the webpage:
The strengths of the challenge are that it is clear-cut, in that the answer to each schema is a binary choice; vivid, in that it is obvious to non-experts that a program that fails to get the right answers clearly has serious gaps in its understanding; and difficult, in that it is far beyond the current state of the art.
This data was manually written by experts such that the schemas are:
easily disambiguated by the human reader (ideally, so easily that the reader does not even notice that there is an ambiguity);
not solvable by simple techniques such as selectional restrictions;
Google-proof; that is, there is no obvious statistical test over text corpora that will reliably disambiguate these correctly.
This dataset has grown over time, and so was produced by a variety of linguistics and AI researchers. See the source field for the source of each instance.
Annotations are produced by the experts who construct the examples.
Who are the annotators?
See above.
This work is licensed under a Creative Commons Attribution 4.0 International License.
The Winograd Schema Challenge, including many of the examples here, was proposed by Levesque et al. (2012):
@inproceedings{levesque2012winograd,
  title={The winograd schema challenge},
  author={Levesque, Hector and Davis, Ernest and Morgenstern, Leora},
  booktitle={Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning},
  year={2012},
  organization={Citeseer}
}
Thanks to @joeddav for adding this dataset.