The Medical Question and Answering dataset (MQuAD) is a refined collection that combines the datasets listed below. It is hosted on the Hugging Face Hub and can be loaded with the datasets library as follows.
    from datasets import load_dataset

    dataset = load_dataset("danielpark/MQuAD-v1")
The medical Q/A datasets were gathered from the following websites.
MQuAD ships precomputed question and answer embeddings so that users can save the time and compute otherwise spent on embedding. The embedding arrays are stored as strings, so it is recommended to convert the string-formatted arrays to float arrays before use, as follows.
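    from datasets import load_dataset
    from utilfunction import col_convert
    import pandas as pd

    # Load the dataset and move the training split into a pandas DataFrame
    qa = load_dataset("danielpark/MQuAD-v1", "csv")
    df_qa = pd.DataFrame(qa['train'])

    # Convert the string-formatted embedding columns into float arrays
    df_qa = col_convert(df_qa, ['Q_FFNN_embeds', 'A_FFNN_embeds'])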
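If utilfunction is not available, the same conversion can be approximated with plain pandas and NumPy. The sketch below is not the col_convert implementation. It assumes each embedding is stored as a bracketed list of numbers separated by commas or whitespace (for example "[0.1, 0.2]" or "[0.1 0.2]"), which should be checked against the actual data. The helper names parse_embedding and convert_columns are hypothetical, and the toy values are made up for illustration.

```python
import numpy as np
import pandas as pd


def parse_embedding(text: str) -> np.ndarray:
    """Parse a string such as "[0.1, 0.2]" or "[0.1 0.2]" into a float32 array.

    Assumption: the string is a flat, bracketed list of numbers.
    """
    # Drop the enclosing brackets, treat commas as whitespace, then split.
    tokens = text.strip().strip("[]").replace(",", " ").split()
    return np.array([float(t) for t in tokens], dtype=np.float32)


def convert_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given string columns parsed into arrays."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].map(parse_embedding)
    return out


# Toy frame standing in for the real data (hypothetical values).
toy = pd.DataFrame({
    "Q_FFNN_embeds": ["[0.1, 0.2, 0.3]", "[1.0 2.0 3.0]"],
    "A_FFNN_embeds": ["[0.5, 0.5, 0.5]", "[-1.0, 0.0, 1.0]"],
})
toy = convert_columns(toy, ["Q_FFNN_embeds", "A_FFNN_embeds"])
```

On the real data, the call would be convert_columns(df_qa, ['Q_FFNN_embeds', 'A_FFNN_embeds']), applied to the DataFrame built from the train split. Each cell then holds a NumPy array, so a whole column can be stacked into a 2-D matrix with np.stack(df_qa['Q_FFNN_embeds'].to_list()) when all rows have the same length.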