数据集:

derek-thomas/ScienceQA

任务:

多项选择

问答

task_categories:other

子任务:

multiple-choice-qa closed-domain-qa open-domain-qa

语言:

计算机处理:

monolingual

大小:

10K<n<100K

语言创建人:

expert-generated found

批注创建人:

expert-generated found

源数据集:

original

预印本库:

arxiv:2209.09513

其他:

multi-modal-qa science chemistry

许可:

cc-by-sa-4.0

数据集介绍文件清单

中文

Dataset Card Creation Guide

Dataset Summary

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Supported Tasks and Leaderboards

Multi-modal Multiple Choice

Languages

English

Dataset Structure

Data Instances

Explore more samples here .

{'image': Image,
 'question': 'Which of these states is farthest north?',
 'choices': ['West Virginia', 'Louisiana', 'Arizona', 'Oklahoma'],
 'answer': 0,
 'hint': '',
 'task': 'closed choice',
 'grade': 'grade2',
 'subject': 'social science',
 'topic': 'geography',
 'category': 'Geography',
 'skill': 'Read a map: cardinal directions',
 'lecture': 'Maps have four cardinal directions, or main directions. Those directions are north, south, east, and west.\nA compass rose is a set of arrows that point to the cardinal directions. A compass rose usually shows only the first letter of each cardinal direction.\nThe north arrow points to the North Pole. On most maps, north is at the top of the map.',
 'solution': 'To find the answer, look at the compass rose. Look at which way the north arrow is pointing. West Virginia is farthest north.'}

Some records might be missing any or all of image, lecture, solution.

Data Fields

image : Contextual image
question : Prompt relating to the lecture
choices : Multiple choice answer with 1 correct to the question
answer : Index of choices corresponding to the correct answer
hint : Hint to help answer the question
task : Task description
grade : Grade level from K-12
subject : High level
topic : natural-sciences, social-science, or language-science
category : A subcategory of topic
skill : A description of the task required
lecture : A relevant lecture that a question is generated from
solution : Instructions on how to solve the question

Note that the descriptions can be initialized with the Show Markdown Data Fields output of the Datasets Tagging app , you will then only need to refine the generated descriptions.

Data Splits

name: train
- num_bytes: 16416902
- num_examples: 12726
name: validation
- num_bytes: 5404896
- num_examples: 4241
name: test
- num_bytes: 5441676
- num_examples: 4241

Dataset Creation

Curation Rationale

When answering a question, humans utilize the information available across different modalities to synthesize a consistent and complete chain of thought (CoT). This process is normally a black box in the case of deep learning models like large-scale language models. Recently, science question benchmarks have been used to diagnose the multi-hop reasoning ability and interpretability of an AI system. However, existing datasets fail to provide annotations for the answers, or are restricted to the textual-only modality, small scales, and limited domain diversity. To this end, we present Science Question Answering (ScienceQA).

Source Data

ScienceQA is collected from elementary and high school science curricula.

Initial Data Collection and Normalization

See Below

Who are the source language producers?

See Below

Annotations

Questions in the ScienceQA dataset are sourced from open resources managed by IXL Learning, an online learning platform curated by experts in the field of K-12 education. The dataset includes problems that align with California Common Core Content Standards. To construct ScienceQA, we downloaded the original science problems and then extracted individual components (e.g. questions, hints, images, options, answers, lectures, and solutions) from them based on heuristic rules. We manually removed invalid questions, such as questions that have only one choice, questions that contain faulty data, and questions that are duplicated, to comply with fair use and transformative use of the law. If there were multiple correct answers that applied, we kept only one correct answer. Also, we shuffled the answer options of each question to ensure the choices do not follow any specific pattern. To make the dataset easy to use, we then used semi-automated scripts to reformat the lectures and solutions. Therefore, special structures in the texts, such as tables and lists, are easily distinguishable from simple text passages. Similar to ImageNet, ReClor, and PMR datasets, ScienceQA is available for non-commercial research purposes only and the copyright belongs to the original authors. To ensure data quality, we developed a data exploration tool to review examples in the collected dataset, and incorrect annotations were further manually revised by experts. The tool can be accessed at https://scienceqa.github.io/explore.html .

Annotation process

See above

Who are the annotators?

See above

Personal and Sensitive Information

More Information Needed

Considerations for Using the Data

Additional Information

Dataset Curators

Pan Lu1,3
Swaroop Mishra2,3
Tony Xia1
Liang Qiu1
Kai-Wei Chang1
Song-Chun Zhu1
Oyvind Tafjord3
Peter Clark3
Ashwin Kalyan3

From:

University of California, Los Angeles

Arizona State University

Allen Institute for AI

Licensing Information

Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Citation Information

Provide the BibTex -formatted reference for the dataset. For example:

@inproceedings{lu2022learn,
    title={Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering},
    author={Lu, Pan and Mishra, Swaroop and Xia, Tony and Qiu, Liang and Chang, Kai-Wei and Zhu, Song-Chun and Tafjord, Oyvind and Clark, Peter and Ashwin Kalyan},
    booktitle={The 36th Conference on Neural Information Processing Systems (NeurIPS)},
    year={2022}
}

Contributions

Thanks to Derek Thomas @datavistics for adding this dataset.

作者:

derek-thomas

数据集大小:

597.47 MB