Dataset: spider
Tasks: Text2Text Generation
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Annotation creators: expert-generated
Source datasets: original
Other: text-to-sql
License: cc-by-4.0
Spider is a large-scale, complex, cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.
The leaderboard can be seen at https://yale-lily.github.io/spider
The text in the dataset is in English.
What do the instances that comprise the dataset represent?
Each instance is a natural language question paired with the equivalent SQL query.
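To make the question/query pairing concrete, here is a minimal sketch of one instance and of running its SQL against a database. The `Instance` class, its field names, the `employee` table, and the example question are all hypothetical and made up for illustration; they are not taken from Spider and do not describe its actual schema or file format.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Instance:
    """One question/SQL pair (hypothetical field names)."""
    question: str  # natural language question
    query: str     # equivalent SQL query


# Made-up example, NOT an actual Spider instance.
example = Instance(
    question="How many employees earn more than 50000?",
    query="SELECT count(*) FROM employee WHERE salary > 50000",
)


def run_query(instance, rows):
    """Execute the instance's SQL on a toy in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
    result = conn.execute(instance.query).fetchall()
    conn.close()
    return result


rows = [("Ann", 42000), ("Bo", 61000), ("Cy", 75000)]
result = run_query(example, rows)
print(result)  # [(2,)]
```

A text-to-SQL system would be given only the question and the database schema, and would be expected to produce a query equivalent to the one stored in the pair.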
How many instances are there in total?
What data does each instance consist of?
[More Information Needed]
train: 7,000 question and SQL query pairs
dev: 1,034 question and SQL query pairs
The dataset was annotated by 11 college students at Yale University.
Annotation process
Who are the annotators? [More Information Needed]
The authors listed on the homepage maintain and support the dataset.
The Spider dataset is licensed under CC BY-SA 4.0.
@article{yu2018spider,
  title={Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task},
  author={Yu, Tao and Zhang, Rui and Yang, Kai and Yasunaga, Michihiro and Wang, Dongxu and Li, Zifan and Ma, James and Li, Irene and Yao, Qingning and Roman, Shanelle and others},
  journal={arXiv preprint arXiv:1809.08887},
  year={2018}
}
Thanks to @olinguyen for adding this dataset.