Dataset: snips_built_in_intents
Tasks: text classification
Languages: en
Multilinguality: monolingual
Size: n<1K
Language creators: expert-generated
Annotation creators: expert-generated
Source datasets: original
Preprint: arxiv:1805.10190
License: cc0-1.0

Snips' built-in intents dataset was initially used to compare different voice assistants and was later released as a public dataset, hosted at https://github.com/sonos/nlu-benchmark in the folder 2016-12-built-in-intents. The dataset contains 328 utterances over 10 intent classes. A related Medium post is available at https://medium.com/snips-ai/benchmarking-natural-language-understanding-systems-d35be6ce568d .
There are no related shared tasks that we are aware of.
English
The dataset contains 328 utterances over 10 intent classes. Each sample looks like: {'label': 8, 'text': 'Transit directions to Barcelona Pizza.'}
The source data is not split.
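Because no official split is provided, anyone benchmarking a classifier has to carve out their own held-out set. The following is a minimal sketch of one way to do that, assuming records shaped like the sample above (a `label` integer and a `text` string). The function name `stratified_split`, its `test_fraction` and `seed` parameters, and the toy records are all hypothetical illustrations and are not part of the dataset or of any library.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split records into (train, test) so every label appears in both parts
    whenever that label has at least two examples."""
    # Group the records by their intent label.
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample["label"]].append(sample)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        if len(group) > 1:
            # Hold out at least one example, but never the whole group.
            n_test = min(len(group) - 1, max(1, round(len(group) * test_fraction)))
        else:
            n_test = 0  # a single example stays in the training part
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy records (made up for illustration, not taken from the dataset).
toy = [{"label": i % 3, "text": f"utterance {i}"} for i in range(30)]
train, test = stratified_split(toy)
```

With only 328 utterances spread over 10 classes, a per-label split like this avoids a held-out set that happens to miss an intent entirely. The Hugging Face `datasets` library also provides `Dataset.train_test_split`, which may be preferable when the data is already loaded through that library.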
The dataset was originally created to compare the performance of a number of voice assistants. However, the labelled utterances are useful for developing and benchmarking text chatbots as well.
It is not clear how the data was collected. According to the Medium post: "The benchmark relies on a set of 328 queries built by the business team at Snips, and kept secret from data scientists and engineers throughout the development of the solution."
Who are the source language producers?
Originally prepared by snips.ai. The Snips team joined Sonos in November 2019. These open datasets remain available, and access to them is now managed by the Sonos Voice Experience Team. Please email sve-research@sonos.com with any questions.
Who are the annotators?
[More Information Needed]
The source data is licensed under Creative Commons Zero v1.0 Universal.
Any publication based on these datasets must include a full citation to the following paper in which the results were published by the Snips Team:
Coucke A. et al., "Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces." CoRR 2018, https://arxiv.org/abs/1805.10190
Thanks to @bduvenhage for adding this dataset.