Dataset: quoref
Task: question answering
Language: en
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotation creators: crowdsourced
Source datasets: original
License: cc-by-4.0

Quoref is a QA dataset that tests the coreferential reasoning capability of reading comprehension systems. In this span-selection benchmark containing 24K questions over 4.7K paragraphs from Wikipedia, a system must resolve hard coreferences before selecting the appropriate span(s) in the paragraphs to answer the questions.
An example of 'validation' looks as follows.
This example was too long and was cropped: { "answers": { "answer_start": [1633], "text": ["Frankie"] }, "context": "\"Frankie Bono, a mentally disturbed hitman from Cleveland, comes back to his hometown in New York City during Christmas week to ...", "id": "bfc3b34d6b7e73c0bd82a009db12e9ce196b53e6", "question": "What is the first name of the person who has until New Year's Eve to perform a hit?", "title": "Blast of Silence", "url": "https://en.wikipedia.org/wiki/Blast_of_Silence" }
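The `answers` field follows the common span-selection convention: each `answer_start` is a character offset into `context` at which the corresponding answer `text` begins. A minimal sketch of recovering a span from one record; the record below is hypothetical and shortened (the real example above is cropped, so its offset of 1633 points into the full Wikipedia paragraph):

```python
# A Quoref-style record with the same fields as the example above.
# The context here is shortened and invented for illustration; real
# contexts are full Wikipedia paragraphs with correspondingly larger offsets.
record = {
    "id": "bfc3b34d6b7e73c0bd82a009db12e9ce196b53e6",
    "title": "Blast of Silence",
    "context": "Frankie Bono, a mentally disturbed hitman from Cleveland, "
               "comes back to his hometown in New York City.",
    "question": "What is the first name of the person who has until "
                "New Year's Eve to perform a hit?",
    "answers": {"answer_start": [0], "text": ["Frankie"]},
}

# answer_start is a character offset into context, so the answer span
# can be recovered by slicing:
start = record["answers"]["answer_start"][0]
answer = record["answers"]["text"][0]
span = record["context"][start:start + len(answer)]
print(span)  # Frankie
```

The invariant `context[start:start + len(text)] == text` holds for every answer, which is useful for validating preprocessing that re-tokenizes or truncates contexts.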
The data fields are the same among all splits.
| name    | train | validation |
|---------|-------|------------|
| default | 19399 | 2418       |
    @article{allenai:quoref,
      author  = {Pradeep Dasigi and Nelson F. Liu and Ana Marasovic and Noah A. Smith and Matt Gardner},
      title   = {Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning},
      journal = {arXiv:1908.05803v2},
      year    = {2019},
    }
Thanks to @lewtun, @patrickvonplaten, @thomwolf for adding this dataset.