Dataset:
Baybars/parla_text_corpus
Sub-tasks:
language-modeling
Languages:
ca
Multilinguality:
monolingual
Size:
100K<n<1M
Language creators:
various
Annotation creators:
no-annotation
Source datasets:
found
License:
cc-by-4.0

Spoken text corpus for Catalan, derived and cleaned from three sources: OpenSubtitles, Tv3Parla and Festcat.