Dataset:

lvwerra/stack-exchange-paired

StackExchange Paired

This is a processed version of the HuggingFaceH4/stack-exchange-preferences dataset. The following steps were applied:

  • Parse HTML to Markdown with markdownify
  • Create pairs (response_j, response_k) where j was rated better than k
  • Sample at most 10 pairs per question
  • Shuffle the dataset globally
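The pairing, sampling, and shuffling steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the processing notebook: the function names `make_pairs` and `build_dataset`, the `(text, score)` input layout, the skipping of tied scores, and the fixed seed are all assumptions made for the example. The HTML-to-Markdown step is left out so the sketch depends only on the standard library.

```python
import itertools
import random


def make_pairs(answers, max_pairs=10, rng=None):
    """Build preference pairs for one question.

    `answers` is a list of (text, score) tuples (hypothetical layout).
    In each returned pair, response_j has the higher score.
    """
    if rng is None:
        rng = random.Random(0)
    pairs = []
    for (text_a, score_a), (text_b, score_b) in itertools.combinations(answers, 2):
        if score_a == score_b:
            continue  # assumption: ties carry no preference and are skipped
        if score_a > score_b:
            pairs.append({"response_j": text_a, "response_k": text_b})
        else:
            pairs.append({"response_j": text_b, "response_k": text_a})
    # Keep at most `max_pairs` pairs per question, chosen at random.
    if len(pairs) > max_pairs:
        pairs = rng.sample(pairs, max_pairs)
    return pairs


def build_dataset(questions, max_pairs=10, seed=0):
    """Pair answers for every question, then shuffle all rows together.

    `questions` is a list of (question_text, answers) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for question, answers in questions:
        for pair in make_pairs(answers, max_pairs, rng):
            rows.append({"question": question, **pair})
    rng.shuffle(rows)  # global shuffle across questions
    return rows
```

A question with n distinctly scored answers yields n(n-1)/2 candidate pairs, so the cap of 10 only takes effect for questions with six or more such answers.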

This dataset is designed for preference learning. The processing notebook is included in the repository as well.