Model:

facebook/wav2vec2-large-uralic-voxpopuli-v2


Wav2Vec2-large-VoxPopuli-V2

Facebook's Wav2Vec2 large model, pretrained only on Uralic languages using 42.5k hours of unlabeled data from the VoxPopuli corpus.

The model is pretrained on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
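As a rough illustration of the sampling-rate requirement, the sketch below reads the rate from a WAV header and, if needed, resamples a mono signal to 16 kHz. The helper names `wav_sample_rate` and `resample_linear` are hypothetical and not part of the model or any library. The linear interpolation is deliberately naive (no anti-aliasing filter); for real data a proper resampler from an audio library is the better choice.

```python
import wave

TARGET_RATE = 16_000  # sampling rate (Hz) the model was pretrained on


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Naively resample a mono signal by linear interpolation.

    Illustration only: there is no anti-aliasing filter, so this is
    not suitable for production-quality downsampling.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, one second of 48 kHz audio (48,000 samples) passed through `resample_linear(samples, 48_000)` yields 16,000 samples, which matches the rate the model expects.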

Note: This model does not have a tokenizer, as it was pretrained on audio alone. To use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled data (transcribed speech) in a Uralic language. Check out this blog for a more detailed explanation of how to fine-tune the model.
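A minimal sketch of the tokenizer-creation step, assuming a character-level CTC setup: it collects every character from the training transcripts into a vocabulary, swaps the space for a visible `|` word delimiter, and appends `[UNK]` and `[PAD]` tokens. The function name `build_char_vocab` and the sample transcripts are made up for illustration, and the token names are a common convention rather than something this model card prescribes.

```python
import json


def build_char_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id."""
    chars = sorted(set("".join(transcripts)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Use a visible word delimiter instead of a plain space.
    # Assumes "|" does not itself occur in the transcripts.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Special tokens: unknown character and CTC padding/blank.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical transcripts (Finnish), used only for illustration.
transcripts = ["hyvää päivää", "kiitos"]
vocab = build_char_vocab(transcripts)

# Serialized form that could be written to a vocab.json file.
vocab_json = json.dumps(vocab, ensure_ascii=False, indent=2)
```

A vocabulary file of this shape is what a CTC tokenizer such as `Wav2Vec2CTCTokenizer` in the `transformers` library is typically built from before fine-tuning. In practice the transcripts would also be normalized first (lowercasing, stripping punctuation) so the vocabulary stays small.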

Paper: VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux from Facebook AI.

See the official website for more information, here.