Dataset: PolyAI/minds14
Tasks: Automatic Speech Recognition
Sub-tasks: keyword-spotting
Multilinguality: multilingual
Size: 10K<n<100K
ArXiv: arxiv:2104.08524
License: cc-by-4.0

MInDS-14 is a training and evaluation resource for the intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language varieties.
MInDS-14 can be downloaded and used as follows:
    from datasets import load_dataset

    minds_14 = load_dataset("PolyAI/minds14", "fr-FR")  # for French
    # to download all data for multilingual fine-tuning, uncomment the following line
    # minds_14 = load_dataset("PolyAI/minds14", "all")

    # see structure
    print(minds_14)

    # load audio sample on the fly
    audio_input = minds_14["train"][0]["audio"]  # first decoded audio sample
    intent_class = minds_14["train"][0]["intent_class"]  # integer intent id of the first example
    intent = minds_14["train"].features["intent_class"].names[intent_class]  # intent name for that id

    # use audio_input and intent_class to fine-tune your model for audio classification
We show detailed information for the example configuration fr-FR of the dataset. All other configurations have the same structure.
fr-FR
An example of a data instance of the config fr-FR looks as follows:
    {
        "path": "/home/patrick/.cache/huggingface/datasets/downloads/extracted/3ebe2265b2f102203be5e64fa8e533e0c6742e72268772c8ac1834c5a1a921e3/fr-FR~ADDRESS/response_4.wav",
        "audio": {
            "path": "/home/patrick/.cache/huggingface/datasets/downloads/extracted/3ebe2265b2f102203be5e64fa8e533e0c6742e72268772c8ac1834c5a1a921e3/fr-FR~ADDRESS/response_4.wav",
            "array": array(
                [0.0, 0.0, 0.0, ..., 0.0, 0.00048828, -0.00024414], dtype=float32
            ),
            "sampling_rate": 8000,
        },
        "transcription": "je souhaite changer mon adresse",
        "english_transcription": "I want to change my address",
        "intent_class": 1,
        "lang_id": 6,
    }
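Since each instance carries both the decoded waveform and its sampling rate, the clip length follows directly from the two. The snippet below is a minimal sketch of that calculation; `clip_duration_seconds` is a hypothetical helper (not part of the `datasets` library), and the record is a synthetic stand-in shaped like the fr-FR instance, with silence in place of real audio.

```python
from typing import Any, Dict


def clip_duration_seconds(example: Dict[str, Any]) -> float:
    """Return the clip length in seconds: number of samples / sampling rate."""
    audio = example["audio"]
    return len(audio["array"]) / audio["sampling_rate"]


# Hypothetical record mirroring the fields of the fr-FR instance.
# The waveform is synthetic (16000 zero samples), not real MInDS-14 audio.
example = {
    "audio": {
        "path": "response_4.wav",
        "array": [0.0] * 16000,
        "sampling_rate": 8000,
    },
    "transcription": "je souhaite changer mon adresse",
    "english_transcription": "I want to change my address",
    "intent_class": 1,
    "lang_id": 6,
}

duration = clip_duration_seconds(example)
print(f"{duration:.2f} s")  # 16000 samples at 8000 Hz -> 2.00 s
```

The same helper should work on a real decoded sample, because `len()` also applies to the one-dimensional NumPy array stored under `"array"`. Note that the audio is sampled at 8000 Hz, so a model that expects a different rate would need the audio resampled first.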
The data fields are the same among all splits.
Every config has only the "train" split, containing approximately 600 examples.
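Because no validation or test split is provided, evaluation requires holding out part of "train" yourself. The following is a minimal sketch of a reproducible split over example indices; `split_indices`, the 10% held-out fraction, and the seed are illustrative assumptions, not a split prescribed by the dataset authors.

```python
import random
from typing import List, Tuple


def split_indices(
    n_examples: int, test_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle range(n_examples) reproducibly and hold out a test portion."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_test = max(1, round(n_examples * test_fraction))
    return indices[n_test:], indices[:n_test]


# Assumed size: roughly 600 examples per config, as stated in the card.
train_idx, test_idx = split_indices(600, test_fraction=0.1, seed=0)
print(len(train_idx), len(test_idx))  # 540 60
```

With the `datasets` library, the resulting index lists can be passed to `Dataset.select`, or the built-in `Dataset.train_test_split(test_size=0.1, seed=0)` can be used to get a comparable split in one call.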
All datasets are licensed under the Creative Commons license (CC-BY).
    @article{DBLP:journals/corr/abs-2104-08524,
      author     = {Daniela Gerz and
                    Pei{-}Hao Su and
                    Razvan Kusztos and
                    Avishek Mondal and
                    Michal Lis and
                    Eshan Singhal and
                    Nikola Mrksic and
                    Tsung{-}Hsien Wen and
                    Ivan Vulic},
      title      = {Multilingual and Cross-Lingual Intent Detection from Spoken Data},
      journal    = {CoRR},
      volume     = {abs/2104.08524},
      year       = {2021},
      url        = {https://arxiv.org/abs/2104.08524},
      eprinttype = {arXiv},
      eprint     = {2104.08524},
      timestamp  = {Mon, 26 Apr 2021 17:25:10 +0200},
      biburl     = {https://dblp.org/rec/journals/corr/abs-2104-08524.bib},
      bibsource  = {dblp computer science bibliography, https://dblp.org}
    }
Thanks to @patrickvonplaten for adding this dataset.