Dataset:
tasksource/tasksource-instruct-v0
Multi-task instruction-tuning data recast from 485 of the tasksource datasets. Dataset size is capped at 30k examples per task to foster task diversity.
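!pip install tasksource pandit

import tasksource, pandit

# Keep instruction-formatted tasks whose id does not contain 'mmlu'
df = tasksource.list_tasks(instruct=True).sieve(id=lambda x: 'mmlu' not in x)

def load_tasks():
    # Yield each task with at most 30k training rows and 200 evaluation rows
    for task in df.id:
        yield tasksource.load_task(task, instruct=True, max_rows=30_000, max_rows_eval=200)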
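The per-task cap can be illustrated on its own with a minimal, library-free sketch. This is not the tasksource implementation: the function name `cap_per_task` and the `task` field are hypothetical, chosen only for illustration, and the real `max_rows` option may select rows differently (for example by sampling rather than taking the first rows).

```python
from collections import defaultdict

def cap_per_task(examples, max_rows=30_000):
    """Keep at most `max_rows` examples per task, preserving input order."""
    counts = defaultdict(int)
    kept = []
    for ex in examples:
        task = ex["task"]  # hypothetical field name, assumed for this sketch
        if counts[task] < max_rows:
            counts[task] += 1
            kept.append(ex)
    return kept

# Toy data: one large task and one small task.
toy = [{"task": "big", "text": f"b{i}"} for i in range(5)]
toy.append({"task": "small", "text": "s0"})

capped = cap_per_task(toy, max_rows=2)
```

With a cap, a very large task cannot dominate the mixture, while small tasks are kept in full.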
https://github.com/sileod/tasksource
TSI is HuggingFace-centric and based on tasksource, a curated collection of HF datasets. It can be scaled to many more examples. tasksource focuses on discriminative tasks (Classification/TokenClassification/MultipleChoice), and its coverage of discriminative tasks is greater than Flan's. The list of tasks is available in the tasksource repository. Examples of tasks not in Flan V2 include DynaSent (adversarial sentiment analysis), DynaHate (adversarial hate speech detection), discriminative bAbI, epistemic logic, RuleTaker, veridicality, discourse relation prediction, and dozens of interesting natural language inference datasets.
TSI answers are mostly short answers to multiple-choice questions, but they target a wide array of problems. TSI is reasoning-intensive, whereas some Flan tasks are not necessarily specific (e.g. generating a hypothesis from a premise for NLI). The instructions explicitly state that answers should not include explanations, so that models are not biased toward short answers when TSI is used together with other instruction datasets.
flan-v2 and tasksource-instruct can be combined to improve the reasoning capabilities of LLMs.
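One simple way to combine two instruction datasets is to interleave their examples at random. The sketch below is a hedged illustration on toy lists, not a recipe from the authors: `mix_datasets`, `prob_a`, and the toy records are hypothetical, and a suitable mixing ratio is not specified in the source.

```python
import random

def mix_datasets(a, b, prob_a=0.5, seed=0):
    """Randomly interleave two example lists until both are exhausted.

    Order within each source is preserved; `prob_a` is the chance of
    drawing from `a` while both sources still have examples left.
    """
    rng = random.Random(seed)
    a, b = list(a), list(b)
    i = j = 0
    mixed = []
    while i < len(a) or j < len(b):
        # Take from `a` if `b` is exhausted, or if `a` wins the random draw.
        take_a = j >= len(b) or (i < len(a) and rng.random() < prob_a)
        if take_a:
            mixed.append(a[i])
            i += 1
        else:
            mixed.append(b[j])
            j += 1
    return mixed

# Toy stand-ins for the two instruction datasets (assumed structure).
flan_like = [{"source": "flan", "id": n} for n in range(3)]
tsi_like = [{"source": "tsi", "id": n} for n in range(4)]

mixed = mix_datasets(flan_like, tsi_like, prob_a=0.5, seed=0)
```

For datasets loaded with the HuggingFace `datasets` library, `interleave_datasets` provides comparable mixing with sampling probabilities.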
damien.sileo@inria.fr
https://arxiv.org/abs/2301.05948
@article{sileo2023tasksource,
  title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
  author={Sileo, Damien},
  url={https://arxiv.org/abs/2301.05948},
  journal={arXiv preprint arXiv:2301.05948},
  year={2023}
}