Dataset: squad_v1_pt
Task: question answering
Language: pt
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: crowdsourced
Annotation creators: crowdsourced
Source datasets: original
Preprint: arxiv:1606.05250
License: mit

Portuguese translation of the SQuAD dataset. The translation was performed automatically using the Google Cloud API.
An example of 'train' looks as follows.
This example was too long and was cropped:
{
  "answers": {
    "answer_start": [0],
    "text": ["Saint Bernadette Soubirous"]
  },
  "context": "\"Arquitetonicamente, a escola tem um caráter católico. No topo da cúpula de ouro do edifício principal é uma estátua de ouro da ...",
  "id": "5733be284776f41900661182",
  "question": "A quem a Virgem Maria supostamente apareceu em 1858 em Lourdes, na França?",
  "title": "University_of_Notre_Dame"
}
The data fields are the same among all splits.
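The fields visible in the example record can be checked with a few lines of Python. The sketch below is not part of the dataset tooling: `check_record` and `EXPECTED_FIELDS` are hypothetical names introduced here, and the check assumes the usual SQuAD convention that `answer_start` is a character offset into `context`, which this card does not state explicitly. It verifies that a record carries the five fields shown above and reports, for each answer, whether its text actually occurs at the given offset.

```python
from typing import List, Tuple

# Record copied from the cropped 'train' example in this card.
# The context is truncated in the card, so it stays truncated here.
example = {
    "answers": {"answer_start": [0], "text": ["Saint Bernadette Soubirous"]},
    "context": "\"Arquitetonicamente, a escola tem um caráter católico. "
               "No topo da cúpula de ouro do edifício principal é uma "
               "estátua de ouro da ...",
    "id": "5733be284776f41900661182",
    "question": "A quem a Virgem Maria supostamente apareceu em 1858 "
                "em Lourdes, na França?",
    "title": "University_of_Notre_Dame",
}

# Field names taken from the example above (hypothetical constant name).
EXPECTED_FIELDS = {"answers", "context", "id", "question", "title"}


def check_record(record: dict) -> List[Tuple[str, int, bool]]:
    """Hypothetical helper: validate fields and test answer alignment.

    Returns one (answer_text, answer_start, is_aligned) tuple per answer,
    where is_aligned is True when the context contains answer_text
    starting exactly at character offset answer_start (assumed convention).
    """
    missing = EXPECTED_FIELDS - record.keys()
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")

    texts = record["answers"]["text"]
    starts = record["answers"]["answer_start"]
    if len(texts) != len(starts):
        raise ValueError("answers.text and answers.answer_start differ in length")

    context = record["context"]
    return [
        (text, start, context.startswith(text, start))
        for text, start in zip(texts, starts)
    ]


if __name__ == "__main__":
    for text, start, aligned in check_record(example):
        print(f"{text!r} at offset {start}: aligned={aligned}")
```

On the cropped example the answer is reported as not aligned: the answer text is in English while the context is in Portuguese. Whether that reflects the automatic translation or only the cropping of the example is not something this card settles, so it may be worth running a check of this kind before using the offsets for span-based training.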
name | train | validation |
---|---|---|
default | 87599 | 10570 |
@article{2016arXiv160605250R,
  author = {{Rajpurkar}, Pranav and {Zhang}, Jian and {Lopyrev}, Konstantin and {Liang}, Percy},
  title = "{SQuAD: 100,000+ Questions for Machine Comprehension of Text}",
  journal = {arXiv e-prints},
  year = 2016,
  eid = {arXiv:1606.05250},
  pages = {arXiv:1606.05250},
  archivePrefix = {arXiv},
  eprint = {1606.05250},
}
Thanks to @thomwolf, @albertvillanova, @lewtun, and @patrickvonplaten for adding this dataset.