Dataset: sciq
Tasks:
Sub-tasks: closed-domain-qa
Language:
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: crowdsourced
Annotation creators: no-annotation
Source datasets: original
License:
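The SciQ dataset contains 13,679 crowdsourced science exam questions covering subjects such as Physics, Chemistry, and Biology. The questions are in multiple-choice format, each with 4 answer options. For most questions, an additional paragraph with supporting evidence for the correct answer is also provided.
An example of "train" looks as follows.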
This example was too long and was cropped:
{
"correct_answer": "coriolis effect",
"distractor1": "muon effect",
"distractor2": "centrifugal effect",
"distractor3": "tropical effect",
"question": "What phenomenon makes global winds blow northeast to southwest or the reverse in the northern hemisphere and northwest to southeast or the reverse in the southern hemisphere?",
"support": "\"Without Coriolis Effect the global winds would blow north to south or south to north. But Coriolis makes them blow northeast to..."
}
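The record above stores the correct answer and the three distractors in separate fields, so presenting a question as a 4-option multiple-choice item means combining and shuffling them. The following is a minimal sketch of one way to do that, using only the field names visible in the example; the helper `to_multiple_choice` and its `seed` parameter are hypothetical and not part of the dataset or any library.

```python
import random

# Field names are taken from the example record; the helper is illustrative only.
OPTION_FIELDS = ("correct_answer", "distractor1", "distractor2", "distractor3")


def to_multiple_choice(record, seed=0):
    """Return (question, shuffled options, index of the correct option)."""
    options = [record[field] for field in OPTION_FIELDS]
    # A seeded generator keeps the option order reproducible across runs.
    random.Random(seed).shuffle(options)
    # Assumes the correct answer string differs from every distractor.
    answer_index = options.index(record["correct_answer"])
    return record["question"], options, answer_index


# The example record from the text, without the cropped "support" field.
example = {
    "correct_answer": "coriolis effect",
    "distractor1": "muon effect",
    "distractor2": "centrifugal effect",
    "distractor3": "tropical effect",
    "question": (
        "What phenomenon makes global winds blow northeast to southwest or "
        "the reverse in the northern hemisphere and northwest to southeast "
        "or the reverse in the southern hemisphere?"
    ),
}

question, options, answer_index = to_multiple_choice(example)
print(question)
for label, option in zip("ABCD", options):
    print(f"{label}. {option}")
print("Answer:", "ABCD"[answer_index])
```

Shuffling matters because the stored field order always lists the correct answer first; without it, a model or a reader could learn the position rather than the content.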
The data fields are the same among all splits.
default

| name | train | validation | test |
|---|---|---|---|
| default | 11679 | 1000 | 1000 |
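As a quick sanity check, the split sizes in the table can be summed and compared with the 13,679 questions quoted in the description. The sketch below hard-codes the numbers from the table; `split_fractions` is a hypothetical helper name used only for illustration.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 11679, "validation": 1000, "test": 1000}


def split_fractions(sizes):
    """Return the total example count and each split's share of it."""
    total = sum(sizes.values())
    return total, {name: count / total for name, count in sizes.items()}


total, fractions = split_fractions(SPLIT_SIZES)
print(total)  # 13679, matching the count given in the description
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```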
This dataset is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License.
@inproceedings{SciQ,
title={Crowdsourcing Multiple Choice Science Questions},
author={Johannes Welbl and Nelson F. Liu and Matt Gardner},
year={2017},
journal={arXiv:1707.06209v1}
}
Thanks to @patrickvonplaten, @lewtun, @thomwolf for adding this dataset.