数据集:
biosses
任务:
文本分类语言:
en计算机处理:
monolingual大小:
n<1K语言创建人:
found批注创建人:
expert-generated源数据集:
original许可:
gpl-3.0BIOSSES is a benchmark dataset for biomedical sentence similarity estimation. The dataset comprises 100 sentence pairs, in which each sentence was selected from the TAC (Text Analysis Conference) Biomedical Summarization Track Training Dataset containing articles from the biomedical domain. The sentence pairs in BIOSSES were selected from citing sentences, i.e. sentences that have a citation to a reference article.
The sentence pairs were evaluated by five different human experts that judged their similarity and gave scores ranging from 0 (no relation) to 4 (equivalent). In the original paper the mean of the scores assigned by the five human annotators was taken as the gold standard. The Pearson correlation between the gold standard scores and the scores estimated by the models was used as the evaluation metric. The strength of correlation can be assessed by the general guideline proposed by Evans (1996) as follows:
Biomedical Semantic Similarity Scoring.
English.
For each instance, there are two sentences (i.e. sentence 1 and 2), and its corresponding similarity score (the mean of the scores assigned by the five human annotators).
{'sentence 1': 'Here, looking for agents that could specifically kill KRAS mutant cells, they found that knockdown of GATA2 was synthetically lethal with KRAS mutation' 'sentence 2': 'Not surprisingly, GATA2 knockdown in KRAS mutant cells resulted in a striking reduction of active GTP-bound RHO proteins, including the downstream ROCK kinase' 'score': 2.2}
No data splits provided.
The TAC (Text Analysis Conference) Biomedical Summarization Track Training Dataset .
Initial Data Collection and Normalization[More Information Needed]
Who are the source language producers?[More Information Needed]
The sentence pairs were evaluated by five different human experts that judged their similarity and gave scores ranging from 0 (no relation) to 4 (equivalent). The score range was described based on the guidelines of SemEval 2012 Task 6 on STS (Agirre et al., 2012). Besides the annotation instructions, example sentences from the biomedical literature were provided to the annotators for each of the similarity degrees.
The table below shows the Pearson correlation of the scores of each annotator with respect to the average scores of the remaining four annotators. It is observed that there is strong association among the scores of the annotators. The lowest correlations are 0.902, which can be considered as an upper bound for an algorithmic measure evaluated on this dataset.
Correlation r | |
---|---|
Annotator A | 0.952 |
Annotator B | 0.958 |
Annotator C | 0.917 |
Annotator D | 0.902 |
Annotator E | 0.941 |
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
BIOSSES is made available under the terms of The GNU Common Public License v.3.0 .
@article{souganciouglu2017biosses, title={BIOSSES: a semantic sentence similarity estimation system for the biomedical domain}, author={So{\u{g}}anc{\i}o{\u{g}}lu, Gizem and {"O}zt{"u}rk, Hakime and {"O}zg{"u}r, Arzucan}, journal={Bioinformatics}, volume={33}, number={14}, pages={i49--i58}, year={2017}, publisher={Oxford University Press} }
Thanks to @bwang482 for adding this dataset.