Dataset:

bigbio/ehr_rel

Language:

en

Multilinguality:

monolingual

License:

apache-2.0

Dataset Card for EHR-Rel

EHR-Rel is a novel open-source biomedical concept relatedness dataset consisting of 3630 concept pairs, six times more than the largest existing dataset. Instead of manually selecting and pairing concepts, as was done in previous work, the dataset is sampled from electronic health records (EHRs) to ensure that the concepts are relevant for the EHR concept retrieval task. A detailed analysis of the concepts in the dataset reveals far broader coverage than existing datasets.

Citation Information

@inproceedings{schulz-etal-2020-biomedical,
    title = {Biomedical Concept Relatedness {--} A large {EHR}-based benchmark},
    author = {Schulz, Claudia  and
      Levy-Kramer, Josh  and
      Van Assel, Camille  and
      Kepes, Miklos  and
      Hammerla, Nils},
    booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
    month = {dec},
    year = {2020},
    address = {Barcelona, Spain (Online)},
    publisher = {International Committee on Computational Linguistics},
    url = {https://aclanthology.org/2020.coling-main.577},
    doi = {10.18653/v1/2020.coling-main.577},
    pages = {6565--6575},
    }