Dataset: nightingal3/fig-qa
Tasks: multiple-choice
Sub-tasks: multiple-choice-qa
Languages: en
Multilinguality: monolingual
Size categories: 10K<n<100K
Language creators: crowdsourced
Source datasets: original
ArXiv: arxiv:2204.12632
License: mit

This is the dataset for the paper "Testing the Ability of Language Models to Interpret Figurative Language". Fig-QA consists of 10,256 examples of human-written creative metaphors that are paired as a Winograd schema. It can be used to evaluate the commonsense reasoning of models. The metaphors themselves can also be used as training data for other tasks, such as metaphor detection or generation.
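For instance, one common way to evaluate a causal language model on such Winograd-style pairs is to compare the total log-probability it assigns to each ending. Below is a minimal sketch; the model choice (gpt2) is arbitrary, and the field names (`startphrase`, `ending1`, `ending2`) are assumptions that should be verified against the actual dataset schema:

```python
# Minimal zero-shot evaluation sketch: pick whichever ending the language
# model assigns the higher total log-probability. Field names are assumed;
# verify them against the actual dataset schema.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(text: str) -> float:
    """Total log-probability the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token,
    # so multiply by the number of predicted (shifted) positions.
    return -out.loss.item() * (ids.size(1) - 1)

def predict(example: dict) -> int:
    """Return 0 or 1 for whichever ending the model scores higher."""
    scores = [
        sequence_logprob(f"{example['startphrase']} {ending}")
        for ending in (example["ending1"], example["ending2"])
    ]
    return int(scores[1] > scores[0])
```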
You can evaluate your models on the test set by submitting to the leaderboard on ExplainaBoard. Click on "New" and select qa-multiple-choice for the task field, and select accuracy for the metric. You should upload results in the form of a system output file in JSON or JSONL format.
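The snippet below is a hypothetical sketch of producing such a file with Python's standard `json` module. The exact field names ExplainaBoard expects are an assumption here, so consult its documentation before submitting:

```python
# Hypothetical sketch: write one JSON object per line (JSONL). The
# "predicted_label" field name is an assumption; check the ExplainaBoard
# docs for the schema the qa-multiple-choice task actually expects.
import json

predictions = [1, 0, 1]  # predicted answer index for each test example, in order

with open("system_output.jsonl", "w") as f:
    for pred in predictions:
        f.write(json.dumps({"predicted_label": pred}) + "\n")
```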
This is the English version. The multilingual version can be found here.
Data splits:
- Train-{S, M (no suffix), XL}: different training set sizes
- Dev
- Test (labels not provided for the test set)
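A minimal sketch of loading the data with the Hugging Face `datasets` library follows; the split and configuration names are assumptions, so inspect the returned `DatasetDict` to see what this repository actually exposes:

```python
# Minimal loading sketch using the Hugging Face `datasets` library.
# The split names below are assumptions; print the DatasetDict to see
# which splits and sized training sets are actually available.
from datasets import load_dataset

ds = load_dataset("nightingal3/fig-qa")
print(ds)              # available splits and their sizes
print(ds["train"][0])  # inspect one example (assumed "train" split name)
```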
These metaphors are human-generated and may contain insults or other explicit content. The authors of the paper manually removed offensive content, but users should keep in mind that some potentially offensive content may remain in the dataset.
MIT License
If you find the dataset useful, please cite this paper:
```
@misc{https://doi.org/10.48550/arxiv.2204.12632,
  doi       = {10.48550/ARXIV.2204.12632},
  url       = {https://arxiv.org/abs/2204.12632},
  author    = {Liu, Emmy and Cui, Chen and Zheng, Kenneth and Neubig, Graham},
  keywords  = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
  title     = {Testing the Ability of Language Models to Interpret Figurative Language},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
}
```