Fine-tuned on the multilingual pretrained model CLSRIL-23. The original fairseq checkpoint is present here. When using this model, make sure that your speech input is sampled at 16 kHz.
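
As a minimal sketch of the 16 kHz requirement, the snippet below reads the sampling rate from a WAV header with Python's standard `wave` module and, if needed, resamples a mono signal by linear interpolation. The function names `wav_sample_rate` and `resample_linear` are hypothetical and not part of this model's code. The resampler applies no anti-aliasing filter, so it is an illustration only; a dedicated audio library is likely a better choice for real data.

```python
import wave

TARGET_RATE = 16_000  # sampling rate (Hz) the model card asks for


def wav_sample_rate(path):
    """Return the sampling rate (Hz) recorded in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def resample_linear(samples, orig_rate, target_rate=TARGET_RATE):
    """Naively resample a mono signal by linear interpolation.

    Illustration only: no anti-aliasing filter is applied, so downsampling
    real speech this way can introduce aliasing.
    """
    if orig_rate == target_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * target_rate / orig_rate))
    step = orig_rate / target_rate  # input samples per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out
```

A typical use would be to call `wav_sample_rate("clip.wav")` (a hypothetical file name) and resample only when the result differs from `TARGET_RATE`.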
Note: Results from this model are obtained without a language model, so you may see a higher word error rate (WER) in some cases.
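
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration under that standard definition; `word_error_rate` is a hypothetical helper, not the evaluation script used for this model, and it does no text normalization.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.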