数据集:
openbookqa
任务:
问答子任务:
open-domain-qa语言:
en计算机处理:
monolingual大小:
1K<n<10K语言创建人:
expert-generated源数据集:
original许可:
license:unknownOpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject.
An example of 'train' looks as follows:
{'id': '7-980', 'question_stem': 'The sun is responsible for', 'choices': {'text': ['puppies learning new tricks', 'children growing up and getting old', 'flowers wilting in a vase', 'plants sprouting, blooming and wilting'], 'label': ['A', 'B', 'C', 'D']}, 'answerKey': 'D'}additional
An example of 'train' looks as follows:
{'id': '7-980', 'question_stem': 'The sun is responsible for', 'choices': {'text': ['puppies learning new tricks', 'children growing up and getting old', 'flowers wilting in a vase', 'plants sprouting, blooming and wilting'], 'label': ['A', 'B', 'C', 'D']}, 'answerKey': 'D', 'fact1': 'the sun is the source of energy for physical cycles on Earth', 'humanScore': 1.0, 'clarity': 2.0, 'turkIdAnonymized': 'b356d338b7'}
The data fields are the same among all splits.
mainname | train | validation | test |
---|---|---|---|
main | 4957 | 500 | 500 |
additional | 4957 | 500 | 500 |
@inproceedings{OpenBookQA2018, title={Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering}, author={Todor Mihaylov and Peter Clark and Tushar Khot and Ashish Sabharwal}, booktitle={EMNLP}, year={2018} }
Thanks to @thomwolf , @patrickvonplaten , @lewtun for adding this dataset.