Dataset:
TigerResearch/tigerbot-stackexchange-qa-en-0.5m
Language:
en
License:
apache-2.0
Tigerbot SFT dataset generated from dump data of the StackExchange Q&A sites.
Original source: https://archive.org/details/stackexchange
import datasets
ds_sft = datasets.load_dataset('TigerResearch/tigerbot-stackexchange-qa-en-0.5m')
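After loading, it can help to check which splits and columns the dataset actually exposes before writing any training code. The following is a minimal sketch, not part of the original card: the helper names `describe_splits` and `load_and_describe` are hypothetical, and no column names are assumed, since the card does not list them.

```python
def describe_splits(dataset_dict):
    """Return {split_name: (num_rows, column_names)} for a DatasetDict-like mapping."""
    return {
        name: (split.num_rows, list(split.column_names))
        for name, split in dataset_dict.items()
    }


def load_and_describe(repo_id='TigerResearch/tigerbot-stackexchange-qa-en-0.5m'):
    """Hypothetical convenience wrapper: load the dataset, then summarize its splits."""
    # Imported lazily; needs the `datasets` package and network access.
    import datasets

    return describe_splits(datasets.load_dataset(repo_id))
```

Calling `load_and_describe()` downloads the dataset on first use and returns a plain dictionary, so the result can be printed or logged without touching the records themselves.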