Dataset: facebook/babi_qa
Tasks: question-answering
Languages: en
Multilinguality: monolingual
Language creators: machine-generated
Annotations creators: machine-generated
Source datasets: original
Other: chained-qa
License: cc-by-3.0

The (20) QA bAbI tasks are a set of proxy tasks that evaluate reading comprehension via question answering. The tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction, and many more. The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human. The aim is to classify these tasks into skill sets, so that researchers can identify (and then rectify) the failings of their systems.
The dataset supports a set of 20 proxy story-based question answering tasks for various "types" in English and Hindi. The tasks are:
task_no | task_name |
---|---|
qa1 | single-supporting-fact |
qa2 | two-supporting-facts |
qa3 | three-supporting-facts |
qa4 | two-arg-relations |
qa5 | three-arg-relations |
qa6 | yes-no-questions |
qa7 | counting |
qa8 | lists-sets |
qa9 | simple-negation |
qa10 | indefinite-knowledge |
qa11 | basic-coreference |
qa12 | conjunction |
qa13 | compound-coreference |
qa14 | time-reasoning |
qa15 | basic-deduction |
qa16 | basic-induction |
qa17 | positional-reasoning |
qa18 | size-reasoning |
qa19 | path-finding |
qa20 | agents-motivations |
The "types" are:
en
hn
shuffled
en-10k, shuffled-10k, and hn-10k
en-valid and en-valid-10k
To get a particular dataset, use load_dataset('babi_qa', type=f'{type}', task_no=f'{task_no}'), where type is one of the types listed above and task_no is one of the task numbers (qa1 to qa20). For example, load_dataset('babi_qa', type='en', task_no='qa1').
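The call described above can be wrapped in a small helper that validates its arguments first. This is a minimal sketch, not part of the dataset's own tooling: the names `VALID_TYPES`, `VALID_TASKS`, `babi_kwargs`, `config_name`, and `load_babi` are hypothetical, and the `load_dataset` call simply mirrors the form quoted in this card. Depending on the installed version of the `datasets` library, the repository id or call form may need to differ.

```python
# Types and task numbers as listed in this card.
VALID_TYPES = {
    "en", "hn", "shuffled",
    "en-10k", "hn-10k", "shuffled-10k",
    "en-valid", "en-valid-10k",
}
VALID_TASKS = {f"qa{i}" for i in range(1, 21)}


def babi_kwargs(type_, task_no):
    """Validate the arguments and return the keyword arguments for load_dataset."""
    if type_ not in VALID_TYPES:
        raise ValueError(f"unknown type: {type_!r}")
    if task_no not in VALID_TASKS:
        raise ValueError(f"unknown task_no: {task_no!r}")
    return {"type": type_, "task_no": task_no}


def config_name(type_, task_no):
    """Build the config label used in the splits table, e.g. 'en-qa1'."""
    babi_kwargs(type_, task_no)  # reuse the validation
    return f"{type_}-{task_no}"


def load_babi(type_, task_no):
    """Load one config (assumes the Hugging Face `datasets` package is installed)."""
    from datasets import load_dataset  # imported lazily so the helpers above work without it

    return load_dataset("babi_qa", **babi_kwargs(type_, task_no))
```

For instance, `load_babi("en", "qa1")` would issue the same call as the example in the text, while `config_name("en", "qa1")` gives the label `en-qa1` used for that config.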
An instance from the en-qa1 config's train split:
{'story': {'answer': ['', '', 'bathroom', '', '', 'hallway', '', '', 'hallway', '', '', 'office', '', '', 'bathroom'], 'id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15'], 'supporting_ids': [[], [], ['1'], [], [], ['4'], [], [], ['4'], [], [], ['11'], [], [], ['8']], 'text': ['Mary moved to the bathroom.', 'John went to the hallway.', 'Where is Mary?', 'Daniel went back to the hallway.', 'Sandra moved to the garden.', 'Where is Daniel?', 'John moved to the office.', 'Sandra journeyed to the bathroom.', 'Where is Daniel?', 'Mary moved to the hallway.', 'Daniel travelled to the office.', 'Where is Daniel?', 'John went back to the garden.', 'John moved to the bedroom.', 'Where is Sandra?'], 'type': [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1]}}
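The instance stores a story as parallel lists, so each question has to be paired with the lines that precede it. The sketch below shows one way to do that. It assumes that a `type` value of 1 marks a question and 0 marks a context sentence, which is consistent with the instance above but is not stated explicitly in this card. The names `EXAMPLE_STORY` and `iter_questions` are hypothetical; the example data is the first six lines of the instance.

```python
# First six lines of the instance shown above.
EXAMPLE_STORY = {
    "id": ["1", "2", "3", "4", "5", "6"],
    "text": [
        "Mary moved to the bathroom.",
        "John went to the hallway.",
        "Where is Mary?",
        "Daniel went back to the hallway.",
        "Sandra moved to the garden.",
        "Where is Daniel?",
    ],
    "type": [0, 0, 1, 0, 0, 1],
    "answer": ["", "", "bathroom", "", "", "hallway"],
    "supporting_ids": [[], [], ["1"], [], [], ["4"]],
}


def iter_questions(story):
    """Yield (context, question, answer, supporting_ids) for each question line.

    `context` is the list of (id, text) pairs for all non-question lines
    seen before the question. Assumes type == 1 means "question".
    """
    context = []
    rows = zip(
        story["id"], story["text"], story["type"],
        story["answer"], story["supporting_ids"],
    )
    for line_id, text, kind, answer, support in rows:
        if kind == 1:
            yield list(context), text, answer, support
        else:
            context.append((line_id, text))


if __name__ == "__main__":
    for context, question, answer, support in iter_questions(EXAMPLE_STORY):
        print(question, "->", answer, "(supported by lines", support, ")")
```

Keeping the line ids alongside the context text makes it possible to look up the sentences named in `supporting_ids` for each question.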
The splits and corresponding sizes are:
config | train | test | validation |
---|---|---|---|
en-qa1 | 200 | 200 | - |
en-qa2 | 200 | 200 | - |
en-qa3 | 200 | 200 | - |
en-qa4 | 1000 | 1000 | - |
en-qa5 | 200 | 200 | - |
en-qa6 | 200 | 200 | - |
en-qa7 | 200 | 200 | - |
en-qa8 | 200 | 200 | - |
en-qa9 | 200 | 200 | - |
en-qa10 | 200 | 200 | - |
en-qa11 | 200 | 200 | - |
en-qa12 | 200 | 200 | - |
en-qa13 | 200 | 200 | - |
en-qa14 | 200 | 200 | - |
en-qa15 | 250 | 250 | - |
en-qa16 | 1000 | 1000 | - |
en-qa17 | 125 | 125 | - |
en-qa18 | 198 | 199 | - |
en-qa19 | 1000 | 1000 | - |
en-qa20 | 94 | 93 | - |
en-10k-qa1 | 2000 | 200 | - |
en-10k-qa2 | 2000 | 200 | - |
en-10k-qa3 | 2000 | 200 | - |
en-10k-qa4 | 10000 | 1000 | - |
en-10k-qa5 | 2000 | 200 | - |
en-10k-qa6 | 2000 | 200 | - |
en-10k-qa7 | 2000 | 200 | - |
en-10k-qa8 | 2000 | 200 | - |
en-10k-qa9 | 2000 | 200 | - |
en-10k-qa10 | 2000 | 200 | - |
en-10k-qa11 | 2000 | 200 | - |
en-10k-qa12 | 2000 | 200 | - |
en-10k-qa13 | 2000 | 200 | - |
en-10k-qa14 | 2000 | 200 | - |
en-10k-qa15 | 2500 | 250 | - |
en-10k-qa16 | 10000 | 1000 | - |
en-10k-qa17 | 1250 | 125 | - |
en-10k-qa18 | 1978 | 199 | - |
en-10k-qa19 | 10000 | 1000 | - |
en-10k-qa20 | 933 | 93 | - |
en-valid-qa1 | 180 | 200 | 20 |
en-valid-qa2 | 180 | 200 | 20 |
en-valid-qa3 | 180 | 200 | 20 |
en-valid-qa4 | 900 | 1000 | 100 |
en-valid-qa5 | 180 | 200 | 20 |
en-valid-qa6 | 180 | 200 | 20 |
en-valid-qa7 | 180 | 200 | 20 |
en-valid-qa8 | 180 | 200 | 20 |
en-valid-qa9 | 180 | 200 | 20 |
en-valid-qa10 | 180 | 200 | 20 |
en-valid-qa11 | 180 | 200 | 20 |
en-valid-qa12 | 180 | 200 | 20 |
en-valid-qa13 | 180 | 200 | 20 |
en-valid-qa14 | 180 | 200 | 20 |
en-valid-qa15 | 225 | 250 | 25 |
en-valid-qa16 | 900 | 1000 | 100 |
en-valid-qa17 | 113 | 125 | 12 |
en-valid-qa18 | 179 | 199 | 19 |
en-valid-qa19 | 900 | 1000 | 100 |
en-valid-qa20 | 85 | 93 | 9 |
en-valid-10k-qa1 | 1800 | 200 | 200 |
en-valid-10k-qa2 | 1800 | 200 | 200 |
en-valid-10k-qa3 | 1800 | 200 | 200 |
en-valid-10k-qa4 | 9000 | 1000 | 1000 |
en-valid-10k-qa5 | 1800 | 200 | 200 |
en-valid-10k-qa6 | 1800 | 200 | 200 |
en-valid-10k-qa7 | 1800 | 200 | 200 |
en-valid-10k-qa8 | 1800 | 200 | 200 |
en-valid-10k-qa9 | 1800 | 200 | 200 |
en-valid-10k-qa10 | 1800 | 200 | 200 |
en-valid-10k-qa11 | 1800 | 200 | 200 |
en-valid-10k-qa12 | 1800 | 200 | 200 |
en-valid-10k-qa13 | 1800 | 200 | 200 |
en-valid-10k-qa14 | 1800 | 200 | 200 |
en-valid-10k-qa15 | 2250 | 250 | 250 |
en-valid-10k-qa16 | 9000 | 1000 | 1000 |
en-valid-10k-qa17 | 1125 | 125 | 125 |
en-valid-10k-qa18 | 1781 | 199 | 197 |
en-valid-10k-qa19 | 9000 | 1000 | 1000 |
en-valid-10k-qa20 | 840 | 93 | 93 |
hn-qa1 | 200 | 200 | - |
hn-qa2 | 200 | 200 | - |
hn-qa3 | 167 | 167 | - |
hn-qa4 | 1000 | 1000 | - |
hn-qa5 | 200 | 200 | - |
hn-qa6 | 200 | 200 | - |
hn-qa7 | 200 | 200 | - |
hn-qa8 | 200 | 200 | - |
hn-qa9 | 200 | 200 | - |
hn-qa10 | 200 | 200 | - |
hn-qa11 | 200 | 200 | - |
hn-qa12 | 200 | 200 | - |
hn-qa13 | 125 | 125 | - |
hn-qa14 | 200 | 200 | - |
hn-qa15 | 250 | 250 | - |
hn-qa16 | 1000 | 1000 | - |
hn-qa17 | 125 | 125 | - |
hn-qa18 | 198 | 198 | - |
hn-qa19 | 1000 | 1000 | - |
hn-qa20 | 93 | 94 | - |
hn-10k-qa1 | 2000 | 200 | - |
hn-10k-qa2 | 2000 | 200 | - |
hn-10k-qa3 | 1667 | 167 | - |
hn-10k-qa4 | 10000 | 1000 | - |
hn-10k-qa5 | 2000 | 200 | - |
hn-10k-qa6 | 2000 | 200 | - |
hn-10k-qa7 | 2000 | 200 | - |
hn-10k-qa8 | 2000 | 200 | - |
hn-10k-qa9 | 2000 | 200 | - |
hn-10k-qa10 | 2000 | 200 | - |
hn-10k-qa11 | 2000 | 200 | - |
hn-10k-qa12 | 2000 | 200 | - |
hn-10k-qa13 | 1250 | 125 | - |
hn-10k-qa14 | 2000 | 200 | - |
hn-10k-qa15 | 2500 | 250 | - |
hn-10k-qa16 | 10000 | 1000 | - |
hn-10k-qa17 | 1250 | 125 | - |
hn-10k-qa18 | 1977 | 198 | - |
hn-10k-qa19 | 10000 | 1000 | - |
hn-10k-qa20 | 934 | 94 | - |
shuffled-qa1 | 200 | 200 | - |
shuffled-qa2 | 200 | 200 | - |
shuffled-qa3 | 200 | 200 | - |
shuffled-qa4 | 1000 | 1000 | - |
shuffled-qa5 | 200 | 200 | - |
shuffled-qa6 | 200 | 200 | - |
shuffled-qa7 | 200 | 200 | - |
shuffled-qa8 | 200 | 200 | - |
shuffled-qa9 | 200 | 200 | - |
shuffled-qa10 | 200 | 200 | - |
shuffled-qa11 | 200 | 200 | - |
shuffled-qa12 | 200 | 200 | - |
shuffled-qa13 | 200 | 200 | - |
shuffled-qa14 | 200 | 200 | - |
shuffled-qa15 | 250 | 250 | - |
shuffled-qa16 | 1000 | 1000 | - |
shuffled-qa17 | 125 | 125 | - |
shuffled-qa18 | 198 | 199 | - |
shuffled-qa19 | 1000 | 1000 | - |
shuffled-qa20 | 94 | 93 | - |
shuffled-10k-qa1 | 2000 | 200 | - |
shuffled-10k-qa2 | 2000 | 200 | - |
shuffled-10k-qa3 | 2000 | 200 | - |
shuffled-10k-qa4 | 10000 | 1000 | - |
shuffled-10k-qa5 | 2000 | 200 | - |
shuffled-10k-qa6 | 2000 | 200 | - |
shuffled-10k-qa7 | 2000 | 200 | - |
shuffled-10k-qa8 | 2000 | 200 | - |
shuffled-10k-qa9 | 2000 | 200 | - |
shuffled-10k-qa10 | 2000 | 200 | - |
shuffled-10k-qa11 | 2000 | 200 | - |
shuffled-10k-qa12 | 2000 | 200 | - |
shuffled-10k-qa13 | 2000 | 200 | - |
shuffled-10k-qa14 | 2000 | 200 | - |
shuffled-10k-qa15 | 2500 | 250 | - |
shuffled-10k-qa16 | 10000 | 1000 | - |
shuffled-10k-qa17 | 1250 | 125 | - |
shuffled-10k-qa18 | 1978 | 199 | - |
shuffled-10k-qa19 | 10000 | 1000 | - |
shuffled-10k-qa20 | 933 | 93 | - |
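The sizes in the table suggest that, for each en-valid and en-valid-10k config, the train and validation splits together are as large as the train split of the corresponding en or en-10k config. This is an observation drawn from the numbers, not a documented guarantee. The sketch below checks it on a few rows copied from the table; the names `SIZES` and `valid_matches_base` are hypothetical.

```python
# (train, test, validation) sizes copied from a few rows of the table above;
# None stands for the "-" entries.
SIZES = {
    "en-qa1": (200, 200, None),
    "en-qa17": (125, 125, None),
    "en-qa18": (198, 199, None),
    "en-valid-qa1": (180, 200, 20),
    "en-valid-qa17": (113, 125, 12),
    "en-valid-qa18": (179, 199, 19),
    "en-10k-qa18": (1978, 199, None),
    "en-valid-10k-qa18": (1781, 199, 197),
}


def valid_matches_base(task_no, ten_k=False):
    """Return True if train + validation of the en-valid config equals
    the train size of the matching en (or en-10k) config."""
    suffix = "-10k" if ten_k else ""
    base_train, _, _ = SIZES[f"en{suffix}-{task_no}"]
    train, _, validation = SIZES[f"en-valid{suffix}-{task_no}"]
    return train + validation == base_train


if __name__ == "__main__":
    for task in ("qa1", "qa17", "qa18"):
        print(task, valid_matches_base(task))
    print("qa18 (10k)", valid_matches_base("qa18", ten_k=True))
```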
Code to generate the tasks is available on GitHub.
Who are the source language producers? [More Information Needed]
Who are the annotators? [More Information Needed]
Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston, at Facebook Research.
Creative Commons Attribution 3.0 License
@misc{dodge2016evaluating,
  title={Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems},
  author={Jesse Dodge and Andreea Gane and Xiang Zhang and Antoine Bordes and Sumit Chopra and Alexander Miller and Arthur Szlam and Jason Weston},
  year={2016},
  eprint={1511.06931},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Thanks to @gchhablani for adding this dataset.