Dataset:
bigbio/twadrl
The TwADR-L dataset maps medical concepts as they are informally expressed on social media (Twitter) to their formal descriptions in a medical ontology (SIDER 4).
@inproceedings{limsopatham-collier-2016-normalising,
title = "Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation",
author = "Limsopatham, Nut and
Collier, Nigel",
booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = aug,
year = "2016",
address = "Berlin, Germany",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P16-1096",
doi = "10.18653/v1/P16-1096",
pages = "1014--1023",
}