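Dataset: ai2_arc
Tasks: question-answering
Languages: en
Multilinguality: monolingual
Size categories: 1K<n<10K
Language creators: found
Annotations creators: found
Source datasets: original
License: cc-by-sa-4.0

A new dataset of 7,787 genuine grade-school level, multiple-choice science questions, assembled to encourage research in advanced question answering. The dataset is partitioned into a Challenge Set and an Easy Set, where the Challenge Set contains only questions answered incorrectly by both a retrieval-based algorithm and a word co-occurrence algorithm. We also provide a corpus of over 14 million science sentences relevant to the task, and an implementation of three neural baseline models for this dataset. We pose ARC as a challenge to the community.

ARC-Challenge
An example of "train" looks as follows.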
{ "answerKey": "B", "choices": { "label": ["A", "B", "C", "D"], "text": ["Shady areas increased.", "Food sources increased.", "Oxygen levels increased.", "Available water increased."] }, "id": "Mercury_SC_405487", "question": "One year, the oak trees in a park began producing more acorns than usual. The next year, the population of chipmunks in the park also increased. Which best explains why there were more chipmunks the next year?" }

ARC-Easy
An example of "train" looks as follows.
{ "answerKey": "B", "choices": { "label": ["A", "B", "C", "D"], "text": ["Shady areas increased.", "Food sources increased.", "Oxygen levels increased.", "Available water increased."] }, "id": "Mercury_SC_405487", "question": "One year, the oak trees in a park began producing more acorns than usual. The next year, the population of chipmunks in the park also increased. Which best explains why there were more chipmunks the next year?" }
The data fields are the same among all splits.
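The record layout shown in the example can be read as follows: `answerKey` names one of the entries in `choices.label`, and the answer text sits at the same position in `choices.text`. The snippet below is a minimal sketch of that lookup using the example record from this card; the helper name `answer_text` is hypothetical and is not part of the dataset or of any library.

```python
import json

# The "train" example from this card, copied as a JSON string.
RECORD_JSON = """
{ "answerKey": "B",
  "choices": { "label": ["A", "B", "C", "D"],
               "text": ["Shady areas increased.", "Food sources increased.",
                        "Oxygen levels increased.", "Available water increased."] },
  "id": "Mercury_SC_405487",
  "question": "One year, the oak trees in a park began producing more acorns than usual. The next year, the population of chipmunks in the park also increased. Which best explains why there were more chipmunks the next year?" }
"""


def answer_text(record: dict) -> str:
    """Return the text of the choice whose label equals answerKey.

    Hypothetical helper: it looks the label up by value instead of
    assuming the labels are always "A" to "D" in that order.
    """
    labels = record["choices"]["label"]
    texts = record["choices"]["text"]
    return texts[labels.index(record["answerKey"])]


record = json.loads(RECORD_JSON)
print(answer_text(record))  # Food sources increased.
```

Looking the label up with `index` keeps the helper correct even if a record lists its labels in a different order or uses a different label scheme.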
name | train | validation | test |
---|---|---|---|
ARC-Challenge | 1119 | 299 | 1172 |
ARC-Easy | 2251 | 570 | 2376 |
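As a quick consistency check, the split sizes in the table can be summed and compared with the 7,787 questions quoted in the description. The sketch below hard-codes the numbers from the table; the names `SPLIT_SIZES` and `config_totals` are illustrative only.

```python
# Split sizes copied from the table in this card.
SPLIT_SIZES = {
    "ARC-Challenge": {"train": 1119, "validation": 299, "test": 1172},
    "ARC-Easy": {"train": 2251, "validation": 570, "test": 2376},
}


def config_totals(sizes: dict) -> dict:
    """Sum the train, validation and test counts for each configuration."""
    return {name: sum(splits.values()) for name, splits in sizes.items()}


totals = config_totals(SPLIT_SIZES)
grand_total = sum(totals.values())

print(totals)       # {'ARC-Challenge': 2590, 'ARC-Easy': 5197}
print(grand_total)  # 7787
```

The two configurations add up to 2,590 and 5,197 questions, which together match the 7,787 stated in the description.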
@article{allenai:arc,
  author  = {Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
  title   = {Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
  journal = {arXiv:1803.05457v1},
  year    = {2018},
}
Thanks to @lewtun, @patrickvonplaten, @thomwolf for adding this dataset.