Datasets:

acul3/KoPI-CC_News

Languages:

id

Language creators:

found

Annotation creators:

no-annotation

Source datasets:

original

License:

cc

Dataset Summary

KoPI (Korpus Perayapan Indonesia)-CC_News is an Indonesian-only extract of the CC-NEWS Common Crawl data, covering 2016 through July 2022. Each snapshot was extracted using warcio and trafilatura, then filtered using fastText.
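The summary names the tools but not the exact procedure, so the following is only a minimal sketch of what such a pipeline might look like, not the authors' actual code. It assumes that fastText is used for language identification (labels of the form `__label__id`), that a confidence threshold of 0.5 is applied, and that the helper names (`iter_html_records`, `extract_text`, `is_indonesian`, `keep_indonesian`) are hypothetical. The third-party imports are placed inside the functions so the filtering logic can be exercised without them.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def iter_html_records(warc_path: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, raw_payload) for each HTTP response record in a WARC file."""
    from warcio.archiveiterator import ArchiveIterator  # third-party: warcio

    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response":
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            yield url, record.content_stream().read()


def extract_text(html: bytes) -> Optional[str]:
    """Extract the main article text from an HTML page, or None if nothing is found."""
    import trafilatura  # third-party: trafilatura

    return trafilatura.extract(html.decode("utf-8", errors="replace"))


def is_indonesian(
    text: str,
    predict: Callable[[str], tuple],
    min_confidence: float = 0.5,  # assumed threshold, not stated in the source
) -> bool:
    """Return True if the top predicted label is Indonesian with enough confidence.

    `predict` is expected to behave like a fastText model's `predict` method:
    it takes one line of text and returns (labels, scores).
    """
    # fastText prediction works on a single line, so newlines are flattened first.
    labels, scores = predict(text.replace("\n", " "))
    if not labels:
        return False
    return labels[0] == "__label__id" and float(scores[0]) >= min_confidence


def keep_indonesian(
    docs: Iterable[Tuple[str, str]],
    predict: Callable[[str], tuple],
) -> Iterator[Tuple[str, str]]:
    """Yield only the (url, text) pairs whose text is classified as Indonesian."""
    for url, text in docs:
        if text and is_indonesian(text, predict):
            yield url, text
```

With the real libraries installed, `predict` could be the `predict` method of a model loaded via `fasttext.load_model(...)` with a language-identification model file, and `docs` could be built by passing each payload from `iter_html_records` through `extract_text`. Which model file and threshold were actually used for KoPI-CC_News is not stated here.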

Details coming soon.