Dataset: lmqg/qg_tweetqa
Tasks: Text Generation
Sub-tasks: language-modeling
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Source datasets: tweet_qa
Preprint: arxiv:2210.03992
License: cc-by-sa-4.0

This is the question and answer generation dataset based on tweet_qa. The test set of the original data is not publicly released, so we randomly sampled test questions from the training set.
The text in the dataset is in English (en).
An example from the 'train' split looks as follows.
{
  'answer': 'vine',
  'paragraph_question': 'question: what site does the link take you to?, context:5 years in 5 seconds. Darren Booth (@darbooth) January 25, 2013',
  'question': 'what site does the link take you to?',
  'paragraph': '5 years in 5 seconds. Darren Booth (@darbooth) January 25, 2013'
}
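The example suggests that `paragraph_question` is the question and the tweet joined into one string. The sketch below splits that string back into its two parts and checks them against the `question` and `paragraph` fields. The `question: ..., context:...` layout is inferred from this single example and is an assumption, not a documented format, and `split_paragraph_question` is a hypothetical helper name.

```python
import re

# Example record copied from the card above.
record = {
    "answer": "vine",
    "paragraph_question": (
        "question: what site does the link take you to?, "
        "context:5 years in 5 seconds. Darren Booth (@darbooth) January 25, 2013"
    ),
    "question": "what site does the link take you to?",
    "paragraph": "5 years in 5 seconds. Darren Booth (@darbooth) January 25, 2013",
}

# Assumed layout, inferred from one example: "question: <question>, context:<paragraph>".
# Whitespace after each label is treated as optional.
PATTERN = re.compile(
    r"^question:\s*(?P<question>.*?),\s*context:\s*(?P<paragraph>.*)$",
    re.DOTALL,
)


def split_paragraph_question(text):
    """Split a combined string into (question, paragraph) under the assumed layout."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("text does not follow the assumed 'question: ..., context: ...' layout")
    return match.group("question"), match.group("paragraph")


question, paragraph = split_paragraph_question(record["paragraph_question"])
print(question == record["question"], paragraph == record["paragraph"])
```

The split happens at the first `, context:` marker, so a question that itself contains that exact text would be cut short; records with a different layout raise a `ValueError` rather than being split silently.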
The data fields are the same among all splits.
train | validation | test |
---|---|---|
9489 | 1086 | 1203 |
@inproceedings{ushio-etal-2022-generative,
    title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
    author = "Ushio, Asahi and Alva-Manchego, Fernando and Camacho-Collados, Jose",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}