Datasets:

bigbio/scitail

Language:

en

Multilinguality:

monolingual

License:

apache-2.0

Dataset Card for SciTail

The SciTail dataset is an entailment dataset created from multiple-choice science exams and web sentences. Each question and its correct answer choice are converted into an assertive statement to form the hypothesis. We use information retrieval to obtain relevant text from a large corpus of web sentences, and use these sentences as the premise P. We crowdsource the annotation of each premise-hypothesis pair as supports (entails) or not (neutral) in order to create the SciTail dataset. The dataset contains 27,026 examples: 10,101 labeled entails and 16,925 labeled neutral.
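As a rough illustration of the structure described above, the following Python sketch models one premise-hypothesis pair and checks that the reported label counts add up. The class name, field names, and the sample sentences are assumptions made for this example; they are invented toy data, not records from SciTail, and they are not the dataset's actual schema.

```python
from collections import Counter
from dataclasses import dataclass

# The two labels named in the description above.
LABELS = ("entails", "neutral")

# Label counts as reported in this card.
REPORTED_COUNTS = {"entails": 10_101, "neutral": 16_925}
REPORTED_TOTAL = 27_026


@dataclass(frozen=True)
class EntailmentPair:
    """One premise-hypothesis pair (hypothetical field names)."""

    premise: str      # retrieved web sentence
    hypothesis: str   # question + correct answer rewritten as a statement
    label: str        # "entails" or "neutral"

    def __post_init__(self):
        # Reject anything outside the two-way label set.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def label_counts(pairs):
    """Count how many pairs carry each label."""
    counts = Counter(pair.label for pair in pairs)
    return {label: counts[label] for label in LABELS}


# Invented toy pairs, for illustration only (not taken from SciTail).
toy_pairs = [
    EntailmentPair(
        premise="Plants use sunlight to make their own food.",
        hypothesis="Plants get the energy to make food from sunlight.",
        label="entails",
    ),
    EntailmentPair(
        premise="Many plants grow best in moist soil.",
        hypothesis="Plants get the energy to make food from sunlight.",
        label="neutral",
    ),
]

# The reported per-label counts should sum to the reported total.
counts_are_consistent = sum(REPORTED_COUNTS.values()) == REPORTED_TOTAL
entails_share = REPORTED_COUNTS["entails"] / REPORTED_TOTAL
```

Here `counts_are_consistent` confirms that the two label counts sum to the stated total, and `entails_share` gives the fraction of examples labeled entails, which shows that the two classes are not balanced.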

Citation Information

@inproceedings{scitail,
    author = {Tushar Khot and Ashish Sabharwal and Peter Clark},
    booktitle = {AAAI},
    title = {SciTail: A Textual Entailment Dataset from Science Question Answering},
    year = {2018}
}