数据集:
bigbio/twadrl
The TwADR-L dataset contains medical concepts written on social media (Twitter) mapped to how they are formally written in medical ontologies (SIDER 4).
@inproceedings{limsopatham-collier-2016-normalising, title = "Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation", author = "Limsopatham, Nut and Collier, Nigel", booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)", month = aug, year = "2016", address = "Berlin, Germany", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/P16-1096", doi = "10.18653/v1/P16-1096", pages = "1014--1023", }