Dataset: movie_rationales
Tasks:
Languages:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: crowdsourced
Source datasets: original
License:
The movie rationales dataset contains human-annotated rationales for movie reviews.
An example from the 'validation' split looks as follows.
{
"evidences": ["Fun movie"],
"label": 1,
"review": "Fun movie\n"
}
The data fields are the same among all splits.
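As a minimal sketch of how a record like the one above might be used, the snippet below locates each rationale inside its review text. It assumes that every entry in "evidences" appears verbatim in "review", which holds for the example shown but is not guaranteed here for the whole dataset. The helper name evidence_spans is hypothetical and not part of any library.

```python
from typing import Dict, List, Tuple

# Record copied from the 'validation' example above.
example = {
    "evidences": ["Fun movie"],
    "label": 1,
    "review": "Fun movie\n",
}


def evidence_spans(record: Dict) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of each evidence in the review.

    Hypothetical helper for illustration. Assumes evidences are verbatim
    substrings of the review; an evidence that is not found yields (-1, -1).
    Only the first occurrence of each evidence is reported.
    """
    spans = []
    for evidence in record["evidences"]:
        start = record["review"].find(evidence)
        end = start + len(evidence) if start >= 0 else -1
        spans.append((start, end))
    return spans


spans = evidence_spans(example)
print(spans)  # [(0, 9)]
```

Character offsets are used here rather than token indices so that the sketch does not depend on any particular tokenizer.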
| name | train | validation | test |
|---|---|---|---|
| default | 1600 | 200 | 199 |
@inproceedings{deyoung-etal-2020-eraser,
title = "{ERASER}: {A} Benchmark to Evaluate Rationalized {NLP} Models",
author = "DeYoung, Jay and
Jain, Sarthak and
Rajani, Nazneen Fatema and
Lehman, Eric and
Xiong, Caiming and
Socher, Richard and
Wallace, Byron C.",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.acl-main.408",
doi = "10.18653/v1/2020.acl-main.408",
pages = "4443--4458",
}
@InProceedings{zaidan-eisner-piatko-2008:nips,
author = {Omar F. Zaidan and Jason Eisner and Christine Piatko},
title = {Machine Learning with Annotator Rationales to Reduce Annotation Cost},
booktitle = {Proceedings of the NIPS*2008 Workshop on Cost Sensitive Learning},
month = {December},
year = {2008}
}
Thanks to @thomwolf, @patrickvonplaten, @lewtun for adding this dataset.