Dataset accompanying the "Probing neural language models for understanding of words of estimative probability" article

This dataset tests the capabilities of language models to correctly capture the meaning of words denoting probabilities (WEP, also called verbal probabilities), e.g. words like "probably", "maybe", "surely", "impossible".

We used probabilistic soft logic to combine probabilistic statements expressed with WEP (WEP-Reasoning). We also used the UNLI dataset (https://nlp.jhu.edu/unli/) to directly check whether models can detect the WEP that matches human-annotated probabilities, following the word-to-probability correspondence of Fagen-Ulmschneider, 2018. The dataset can be used as natural language inference data (context, premise, label) or as multiple-choice question answering (context, valid_hypothesis, invalid_hypothesis).
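The multiple-choice view can be illustrated with a minimal Python sketch. It assumes each record is a plain dictionary with the fields `context`, `valid_hypothesis`, and `invalid_hypothesis` named above. The example record, the `"valid"`/`"invalid"` label strings, and the helper names `to_labeled_pairs` and `pick_hypothesis` are hypothetical and only serve the illustration; they are not taken from the dataset or the article.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def to_labeled_pairs(record: Record) -> List[Tuple[str, str, str]]:
    """Expand one multiple-choice record into two (context, statement, label) pairs.

    The label strings are an assumption made for this sketch.
    """
    return [
        (record["context"], record["valid_hypothesis"], "valid"),
        (record["context"], record["invalid_hypothesis"], "invalid"),
    ]


def pick_hypothesis(record: Record, score: Callable[[str, str], float]) -> str:
    """Return the candidate hypothesis that the scoring function rates highest."""
    candidates = [record["valid_hypothesis"], record["invalid_hypothesis"]]
    return max(candidates, key=lambda hypothesis: score(record["context"], hypothesis))


# Hypothetical record, written only to show the expected shape of the fields.
example: Record = {
    "context": "A man is riding a horse across a field.",
    "valid_hypothesis": "It is probably the case that the man is outdoors.",
    "invalid_hypothesis": "It is impossible that the man is outdoors.",
}

# Toy scorer standing in for a language model; a real scorer would return
# something like the model's likelihood of the hypothesis given the context.
def toy_score(context: str, hypothesis: str) -> float:
    return 1.0 if "probably" in hypothesis else 0.0


pairs = to_labeled_pairs(example)
chosen = pick_hypothesis(example, toy_score)
is_correct = chosen == example["valid_hypothesis"]
```

A model is counted as correct on a record when the hypothesis it prefers is the `valid_hypothesis`; averaging `is_correct` over records would give a multiple-choice accuracy.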

Code: colab

Accepted at *SEM 2023 (The 12th Joint Conference on Lexical and Computational Semantics). Temporary citation:

@article{sileo2022probing,
  title={Probing neural language models for understanding of words of estimative probability},
  author={Sileo, Damien and Moens, Marie-Francine},
  journal={arXiv preprint arXiv:2211.03358},
  year={2022}
}