Dataset: cosmos_qa
Tasks: multiple-choice
Sub-tasks: multiple-choice-qa
Languages: en
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotations creators: crowdsourced
Source datasets: original
ArXiv: arxiv:1909.00277
License: cc-by-4.0

Cosmos QA is a large-scale dataset of 35.6K problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. It focuses on reading between the lines over a diverse collection of people's everyday narratives, asking questions about the likely causes or effects of events, which require reasoning beyond the exact text spans in the context.
An example of 'validation' looks as follows.
This example was too long and was cropped:
{
    "answer0": "If he gets married in the church he wo nt have to get a divorce .",
    "answer1": "He wants to get married to a different person .",
    "answer2": "He wants to know if he does nt like this girl can he divorce her ?",
    "answer3": "None of the above choices .",
    "context": "\"Do i need to go for a legal divorce ? I wanted to marry a woman but she is not in the same religion , so i am not concern of th...",
    "id": "3BFF0DJK8XA7YNK4QYIGCOG1A95STE##3180JW2OT5AF02OISBX66RFOCTG5J7##A2LTOS0AZ3B28A##Blog_56156##q1_a1##378G7J1SJNCDAAIN46FM2P7T6KZEW2",
    "label": 1,
    "question": "Why is this person asking about divorce ?"
}
The data fields are the same among all splits.
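Since every record carries the same fields (`context`, `question`, `answer0` to `answer3`, `label`, `id`), a record can be turned into a multiple-choice prompt with a few lines of Python. The following is a minimal sketch, not part of the dataset tooling: the helper name `to_prompt` and the A-D lettering are illustrative assumptions, and the sample record is the cropped validation example shown above with its `id` omitted for brevity.

```python
from typing import Dict, Tuple

# Field names taken from the example record above.
ANSWER_KEYS = ["answer0", "answer1", "answer2", "answer3"]


def to_prompt(record: Dict) -> Tuple[str, str]:
    """Hypothetical helper: build a prompt string and return it with the gold answer text."""
    options = [record[key] for key in ANSWER_KEYS]
    lines = [
        f"Context: {record['context']}",
        f"Question: {record['question']}",
    ]
    # The A-D lettering is an assumption made for readability; the dataset
    # itself only stores the integer index in `label`.
    for letter, option in zip("ABCD", options):
        lines.append(f"{letter}. {option}")
    gold = options[record["label"]]
    return "\n".join(lines), gold


# The cropped validation example from this card (context stays truncated).
example = {
    "answer0": "If he gets married in the church he wo nt have to get a divorce .",
    "answer1": "He wants to get married to a different person .",
    "answer2": "He wants to know if he does nt like this girl can he divorce her ?",
    "answer3": "None of the above choices .",
    "context": '"Do i need to go for a legal divorce ? I wanted to marry a woman '
               "but she is not in the same religion , so i am not concern of th...",
    "label": 1,
    "question": "Why is this person asking about divorce ?",
}

prompt, gold = to_prompt(example)
print(prompt)
print("Gold:", gold)
```

Here `label` is used as a zero-based index into the four answer fields, which matches the example above where `label` 1 selects `answer1`.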
name | train | validation | test |
---|---|---|---|
default | 25262 | 2985 | 6963 |
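As a quick check on the table above, the split sizes can be summed and expressed as shares of the whole. This is only an arithmetic sketch over the numbers in the table; the variable names are illustrative.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 25262, "validation": 2985, "test": 6963}

total = sum(SPLIT_SIZES.values())
shares = {name: count / total for name, count in SPLIT_SIZES.items()}

for name, count in SPLIT_SIZES.items():
    # Print each split's size and its share of all examples.
    print(f"{name:<10} {count:>6} {shares[name]:.1%}")
print(f"{'total':<10} {total:>6}")
```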
As reported via email by Yejin Choi, the dataset is licensed under the CC BY 4.0 license.
@inproceedings{huang-etal-2019-cosmos,
    title = "Cosmos {QA}: Machine Reading Comprehension with Contextual Commonsense Reasoning",
    author = "Huang, Lifu and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1243",
    doi = "10.18653/v1/D19-1243",
    pages = "2391--2401",
}
Thanks to @patrickvonplaten, @lewtun, @albertvillanova, @thomwolf for adding this dataset.