Dataset:

olm/olm-wikipedia-20221220


Dataset Card for OLM December 2022 Wikipedia

A pretraining dataset built from a December 2022 Wikipedia snapshot using the OLM repo.
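As a minimal sketch of how the dataset might be loaded, the helper below wraps `load_dataset` from the Hugging Face `datasets` library. The dataset ID comes from this card; the helper name `load_olm_wikipedia`, the `"train"` split, and the choice to stream by default are assumptions for illustration and are not stated in the card.

```python
# Dataset ID as given on this card.
DATASET_ID = "olm/olm-wikipedia-20221220"


def load_olm_wikipedia(split="train", streaming=True):
    """Hypothetical helper: load the OLM Wikipedia snapshot from the Hub.

    Assumptions (not confirmed by the card): a "train" split exists, and
    streaming is preferable so the full snapshot is not downloaded up front.
    """
    # Imported lazily so that defining this helper does not require the
    # `datasets` package until it is actually called.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split, streaming=streaming)
```

Calling `load_olm_wikipedia()` requires network access and the `datasets` package. With streaming enabled, records can be read one at a time by iterating over the returned object; the field names of each record should be checked against the dataset itself rather than assumed.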