The Medical Question and Answering dataset(MQuAD) has been refined, including the following datasets. You can download it through the Hugging Face dataset. Use the DATASETS method as follows.
from datasets import load_dataset
dataset = load_dataset("danielpark/MQuAD-v1")
Medical Q/A datasets gathered from the following websites.
The MQuAD provides embedded question and answer arrays in string format, so it is recommended to convert the string-formatted arrays into float format as follows. This measure has been applied to save resources and time used for embedding.
from datasets import load_dataset
from utilfunction import col_convert
import pandas as pd
qa = load_dataset("danielpark/MQuAD-v1", "csv")
df_qa = pd.DataFrame(qa['train'])
df_qa = col_convert(df_qa, ['Q_FFNN_embeds', 'A_FFNN_embeds'])