数据集:
a6kme/minds14-mirror
任务:
子任务:
keyword-spotting计算机处理:
multilingual大小:
10K<n<100K预印本库:
arxiv:2104.08524许可:
MINDS-14 is training and evaluation resource for intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language varieties.
MInDS-14 can be downloaded and used as follows:
from datasets import load_dataset
minds_14 = load_dataset("PolyAI/minds14", "fr-FR") # for French
# to download all data for multi-lingual fine-tuning uncomment following line
# minds_14 = load_dataset("PolyAI/all", "all")
# see structure
print(minds_14)
# load audio sample on the fly
audio_input = minds_14["train"][0]["audio"] # first decoded audio sample
intent_class = minds_14["train"][0]["intent_class"] # first transcription
intent = minds_14["train"].features["intent_class"].names[intent_class]
# use audio_input and language_class to fine-tune your model for audio classification
We show detailed information the example configurations fr-FR of the dataset. All other configurations have the same structure.
fr-FR
An example of a datainstance of the config fr-FR looks as follows:
{
"path": "/home/patrick/.cache/huggingface/datasets/downloads/extracted/3ebe2265b2f102203be5e64fa8e533e0c6742e72268772c8ac1834c5a1a921e3/fr-FR~ADDRESS/response_4.wav",
"audio": {
"path": "/home/patrick/.cache/huggingface/datasets/downloads/extracted/3ebe2265b2f102203be5e64fa8e533e0c6742e72268772c8ac1834c5a1a921e3/fr-FR~ADDRESS/response_4.wav",
"array": array(
[0.0, 0.0, 0.0, ..., 0.0, 0.00048828, -0.00024414], dtype=float32
),
"sampling_rate": 8000,
},
"transcription": "je souhaite changer mon adresse",
"english_transcription": "I want to change my address",
"intent_class": 1,
"lang_id": 6,
}
The data fields are the same among all splits.
Every config only has the "train" split containing of ca. 600 examples.
All datasets are licensed under the Creative Commons license (CC-BY) .
@article{DBLP:journals/corr/abs-2104-08524,
author = {Daniela Gerz and
Pei{-}Hao Su and
Razvan Kusztos and
Avishek Mondal and
Michal Lis and
Eshan Singhal and
Nikola Mrksic and
Tsung{-}Hsien Wen and
Ivan Vulic},
title = {Multilingual and Cross-Lingual Intent Detection from Spoken Data},
journal = {CoRR},
volume = {abs/2104.08524},
year = {2021},
url = {https://arxiv.org/abs/2104.08524},
eprinttype = {arXiv},
eprint = {2104.08524},
timestamp = {Mon, 26 Apr 2021 17:25:10 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2104-08524.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Thanks to @patrickvonplaten for adding this dataset