Dataset: asnq
License: cc-by-nc-sa-3.0
Preprint: arxiv:1911.04118
Annotations creators: crowdsourced
Language creators: found
Size: 10M<n<100M
Multilinguality: monolingual
Language: en
Subtask: multiple-choice-qa
Task: multiple-choice

ASNQ is a dataset for answer sentence selection derived from Google's Natural Questions (NQ) dataset (Kwiatkowski et al. 2019).
Each example contains a question, a candidate sentence, a label indicating whether the sentence answers the question, and two additional features: sentence_in_long_answer, which indicates whether the candidate sentence is contained in the long_answer, and short_answer_in_sentence, which indicates whether the short_answer appears in the candidate sentence.
For more details, see https://arxiv.org/abs/1911.04118 and https://research.google/pubs/pub47761/
An example from the 'validation' split looks as follows.
{ "label": 0, "question": "when did somewhere over the rainbow come out", "sentence": "In films and TV shows ( edit ) In the film Third Finger , Left Hand ( 1940 ) with Myrna Loy , Melvyn Douglas , and Raymond Walburn , the tune played throughout the film in short sequences .", "sentence_in_long_answer": false, "short_answer_in_sentence": false }
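As a minimal sketch, the record above can be parsed into a typed structure using only the Python standard library. The names `AsnqExample`, `parse_example`, and `flag_counts` are hypothetical helpers introduced for illustration, and the check that `label` is either 0 or 1 is an assumption based on the binary description of the label.

```python
import json
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AsnqExample:
    """One ASNQ record; field names follow the example record above."""
    question: str
    sentence: str
    label: int
    sentence_in_long_answer: bool
    short_answer_in_sentence: bool


def parse_example(raw: str) -> AsnqExample:
    """Parse one JSON-encoded record and run a basic sanity check."""
    record = json.loads(raw)
    example = AsnqExample(
        question=record["question"],
        sentence=record["sentence"],
        label=int(record["label"]),
        sentence_in_long_answer=bool(record["sentence_in_long_answer"]),
        short_answer_in_sentence=bool(record["short_answer_in_sentence"]),
    )
    # Assumption: the label is binary (0 or 1).
    if example.label not in (0, 1):
        raise ValueError(f"unexpected label: {example.label}")
    return example


def flag_counts(examples: Iterable[AsnqExample]) -> Counter:
    """Count examples per (sentence_in_long_answer, short_answer_in_sentence) pair."""
    keys: Iterable[Tuple[bool, bool]] = (
        (ex.sentence_in_long_answer, ex.short_answer_in_sentence) for ex in examples
    )
    return Counter(keys)


# The validation record shown above, reproduced as a JSON string.
RAW_EXAMPLE = (
    '{ "label": 0, "question": "when did somewhere over the rainbow come out", '
    '"sentence": "In films and TV shows ( edit ) In the film Third Finger , '
    'Left Hand ( 1940 ) with Myrna Loy , Melvyn Douglas , and Raymond Walburn , '
    'the tune played throughout the film in short sequences .", '
    '"sentence_in_long_answer": false, "short_answer_in_sentence": false }'
)

example = parse_example(RAW_EXAMPLE)
print(example.question, example.label)
print(flag_counts([example]))
```

Grouping by the two boolean features is one way to inspect how candidate sentences are distributed across the flag combinations before training.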
The data fields are the same among all splits.
name | train | validation |
---|---|---|
default | 20377568 | 930062 |
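The split sizes in the table can be checked against the size category listed in the metadata (10M<n<100M). The short sketch below only does arithmetic on the two counts from the table; `SPLIT_SIZES` and `split_fractions` are hypothetical names used for illustration.

```python
from typing import Dict

# Example counts copied from the split table above.
SPLIT_SIZES: Dict[str, int] = {"train": 20_377_568, "validation": 930_062}


def split_fractions(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_examples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

print(f"total examples: {total_examples}")
for name, share in fractions.items():
    print(f"{name}: {share:.2%}")
```

The total falls inside the 10M<n<100M range, and the validation split accounts for roughly 4% of all examples.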
The data is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License: https://github.com/alexa/wqa_tanda/blob/master/LICENSE
@article{Garg_2020,
  title={TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection},
  volume={34},
  ISSN={2159-5399},
  url={http://dx.doi.org/10.1609/AAAI.V34I05.6282},
  DOI={10.1609/aaai.v34i05.6282},
  number={05},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  publisher={Association for the Advancement of Artificial Intelligence (AAAI)},
  author={Garg, Siddhant and Vu, Thuy and Moschitti, Alessandro},
  year={2020},
  month={Apr},
  pages={7780–7788}
}
Thanks to @mkserge for adding this dataset.