Dataset:
bigbio/mqp
The Medical Question Pairs dataset by McCreery et al. (2020) contains pairs of medical questions, each made up of a question and a paraphrased version of it prepared by medical professionals. Each paraphrased version is labelled as similar (syntactically dissimilar but contextually similar) or dissimilar (may look syntactically similar but is contextually dissimilar). Labels: 1 = similar, 0 = dissimilar.
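As a minimal sketch of the record structure described above, a pair and its label could be modelled as follows. The field names `question_1` and `question_2`, the `QuestionPair` class, and the `count_labels` helper are hypothetical names chosen for illustration and are not the dataset's actual schema; only the 1/0 label convention comes from the description.

```python
from dataclasses import dataclass

# Label convention stated in the dataset description.
LABEL_NAMES = {1: "similar", 0: "dissimilar"}


@dataclass(frozen=True)
class QuestionPair:
    """One record; field names are hypothetical, not the dataset's schema."""
    question_1: str
    question_2: str
    label: int  # 1 = similar, 0 = dissimilar

    def label_name(self) -> str:
        # Map the integer label to its human-readable name.
        return LABEL_NAMES[self.label]


def count_labels(pairs):
    """Count how many pairs fall under each label name."""
    counts = {name: 0 for name in LABEL_NAMES.values()}
    for pair in pairs:
        counts[pair.label_name()] += 1
    return counts


# Made-up rows for illustration only; they are not taken from the dataset.
examples = [
    QuestionPair(
        "Can I take ibuprofen with food?",
        "Is it okay to eat before taking ibuprofen?",
        1,
    ),
    QuestionPair(
        "Can I take ibuprofen with food?",
        "Can I take ibuprofen with alcohol?",
        0,
    ),
]
print(count_labels(examples))
```

The two made-up rows mirror the labelling idea: the first pair is worded differently but asks the same thing, while the second looks similar on the surface but asks something else.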
@inproceedings{DBLP:journals/biodb/LiSJSWLDMWL16,
  author    = {McCreery, Clara H. and Katariya, Namit and Kannan, Anitha and Chablani, Manish and Amatriain, Xavier},
  title     = {Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to {COVID-19} {FAQs}},
  booktitle = {KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages     = {3458--3465},
  year      = {2020},
  url       = {https://github.com/curai/medical-question-pair-dataset}
}