Dataset:
TigerResearch/tigerbot-dolly-classification-en-2k
Language:
en
License:
apache-2.0
Tigerbot SFT data for classification tasks, processed from the dolly dataset.
Original source: https://huggingface.co/datasets/databricks/databricks-dolly-15k
databricks-dolly-15k is an open-source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper.
import datasets
ds_sft = datasets.load_dataset('TigerResearch/tigerbot-dolly-classification-en-2k')
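Once the dataset is loaded, each record usually needs to be turned into a prompt/response pair before fine-tuning. The following is a minimal sketch of that step, not the authors' own preprocessing. It assumes each record has `instruction`, `input`, and `output` fields; this card does not state the field names, so check them before relying on this. The helper `to_prompt_pair` and the sample record are hypothetical and for illustration only.

```python
from typing import Dict, Optional, Tuple


def to_prompt_pair(record: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Build a (prompt, response) pair from one SFT record.

    ASSUMPTION: the keys 'instruction', 'input', and 'output' are not
    confirmed by the dataset card; adjust them to the real schema.
    """
    # Treat missing or None fields as empty strings.
    instruction = (record.get("instruction") or "").strip()
    context = (record.get("input") or "").strip()
    response = (record.get("output") or "").strip()

    # Append the optional context to the instruction only when present.
    prompt = f"{instruction}\n\n{context}" if context else instruction
    return prompt, response


# Made-up record for illustration; it is not taken from the dataset.
sample = {
    "instruction": "Classify each animal as a bird or a fish: eagle, salmon.",
    "input": "",
    "output": "Eagle is a bird. Salmon is a fish.",
}
prompt, response = to_prompt_pair(sample)
```

With the dataset loaded as `ds_sft`, the helper could be applied as `[to_prompt_pair(r) for r in ds_sft['train']]`, assuming the data is exposed under a `train` split.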