Dataset:
HuggingFaceH4/helpful_instructions
Helpful Instructions is a dataset of (instruction, completion) pairs that are derived from public datasets. As the name suggests, it focuses on instructions that are "helpful", i.e. the kind of questions or tasks a human user might instruct an AI assistant to perform. You can load the dataset as follows:
from datasets import load_dataset

# Load all subsets
helpful_instructions = load_dataset("HuggingFaceH4/helpful_instructions", name="all")

# Load a single subset
helpful_instructions_subset = load_dataset("HuggingFaceH4/helpful_instructions", name="self_instruct")
This dataset can be used to fine-tune pretrained language models to follow instructions.
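As a rough sketch of how such pairs might be prepared for fine-tuning, the snippet below joins one (instruction, completion) pair into a single training string. The column names `instruction` and `completion`, the prompt template, and the sample record are all assumptions made for illustration; check the dataset's actual column names before relying on them.

```python
def format_example(example, instruction_key="instruction", completion_key="completion"):
    """Join one (instruction, completion) pair into a single training string.

    The key names and the "Instruction:/Response:" template are assumptions,
    not something the dataset itself prescribes.
    """
    instruction = example[instruction_key].strip()
    completion = example[completion_key].strip()
    return f"Instruction: {instruction}\nResponse: {completion}"


# Made-up record, used only to show the shape of the output.
sample = {
    "instruction": " Name three primary colors. ",
    "completion": "Red, yellow, and blue.",
}

formatted = format_example(sample)
print(formatted)
```

If the column names match, the same function could be applied to a loaded dataset with `map`, for example `helpful_instructions.map(lambda ex: {"text": format_example(ex)})`, producing a `text` column suitable for a standard language-modeling fine-tuning loop.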