Dataset:

medalpaca/medical_meadow_wikidoc

Task:

Question answering

Language:

en

License:

cc

Dataset Card for WikiDoc

For the dataset containing patient information from WikiDoc, refer to this dataset

Dataset Summary

This dataset contains medical question-answer pairs extracted from WikiDoc, a collaborative platform for medical professionals to share and contribute to up-to-date medical knowledge. The platform has two main subsites, the "Living Textbook" and "Patient Information". The "Living Textbook" contains chapters for various medical specialties, which we crawled. We then used GPT-3.5-Turbo to rephrase each paragraph heading as a question and used the paragraph as the answer. Patient Information is structured differently, in that each section subheading is already a question, which makes rephrasing unnecessary.
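The pairing step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `question`/`answer` keys, the skipping of empty paragraphs, and the placeholder rephrasing template are all assumptions made for illustration, and the placeholder merely stands in for the GPT-3.5-Turbo call, whose prompt is not given here.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A crawled section: (heading, paragraph text). Hypothetical structure.
Section = Tuple[str, str]


def placeholder_rephrase(heading: str) -> str:
    """Stand-in for the language-model rephrasing step (assumption, not the real prompt)."""
    return f"What is known about {heading.strip().lower()}?"


def build_pairs(
    sections: List[Section],
    rephrase: Optional[Callable[[str], str]] = None,
) -> List[Dict[str, str]]:
    """Turn (heading, paragraph) sections into question-answer records.

    If `rephrase` is given, the heading is converted to a question first
    ("Living Textbook" case). If it is None, the heading is used as-is
    ("Patient Information" case, where headings are already questions).
    """
    pairs = []
    for heading, paragraph in sections:
        answer = paragraph.strip()
        if not answer:
            # Assumption: sections without body text are skipped.
            continue
        question = rephrase(heading) if rephrase else heading.strip()
        pairs.append({"question": question, "answer": answer})
    return pairs


# Illustrative inputs only; the text is placeholder content, not WikiDoc data.
textbook_sections = [("Causes of Anemia", "Example paragraph text."), ("Empty", "  ")]
patient_sections = [("What causes anemia?", "Example paragraph text.")]

textbook_pairs = build_pairs(textbook_sections, rephrase=placeholder_rephrase)
patient_pairs = build_pairs(patient_sections)
```

In a real pipeline, `placeholder_rephrase` would be replaced by a call to the chosen language model, and its output would need review, since rephrased headings do not always yield well-formed questions.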

Note: This dataset is still a work in progress. While the Q/A pairs from Patient Information seem to be mostly correct, the conversion using GPT-3.5-Turbo yielded unsatisfactory results in approximately 30% of cases. We are in the process of cleaning this dataset.

Citation Information

TBA