Dataset: NbAiLab/norec_agg
Tasks: text-classification
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotations creators: expert-generated
Source datasets: original
ArXiv: arxiv:2011.02686
License: cc-by-4.0

Aggregated NoReC_fine: A Fine-grained Sentiment Dataset for Norwegian. This dataset was created by the Nordic Language Processing Laboratory by aggregating the fine-grained annotations in NoReC_fine and removing sentences with conflicting or no sentiment.
The text in the dataset is in Norwegian.
Example of one instance in the dataset.
{'label': 0, 'text': 'Verre er det med slagsmålene .'}
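A minimal sketch of how a record in this shape might be checked before use. The helper name `is_valid_instance` is hypothetical and not part of the dataset or of any library; it assumes only the two fields shown above, an integer `label` and a string `text`.

```python
from typing import Any, Mapping


def is_valid_instance(instance: Mapping[str, Any]) -> bool:
    """Return True if the record has an integer 'label' and a non-empty string 'text'."""
    label = instance.get("label")
    text = instance.get("text")
    # bool is a subclass of int in Python, so exclude it explicitly.
    label_ok = isinstance(label, int) and not isinstance(label, bool)
    text_ok = isinstance(text, str) and bool(text.strip())
    return label_ok and text_ok


# The example instance from the dataset card.
example = {"label": 0, "text": "Verre er det med slagsmålene ."}
print(is_valid_instance(example))
```

A check like this can help catch malformed rows early if the data is exported to another format and read back in.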
The dataset is divided into train, validation, and test splits with the following sizes:
| | Train | Valid | Test |
|---|---|---|---|
| Number of examples | 2675 | 516 | 417 |
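The split sizes above can be sanity-checked after loading. The sketch below assumes the Hugging Face `datasets` library is installed, that the Hub ID `NbAiLab/norec_agg` from the card metadata resolves (it may also need a configuration name or other arguments), and that the splits are named `train`, `validation`, and `test`. The split names are an assumption, not something the card states. `load_norec_agg` and `split_fractions` are hypothetical helper names.

```python
from typing import Dict

# Split sizes as reported in the table above.
# The key names are assumed split names, not confirmed by the card.
EXPECTED_SIZES: Dict[str, int] = {"train": 2675, "validation": 516, "test": 417}


def split_fractions(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_norec_agg():
    """Load the dataset from the Hugging Face Hub (requires network access)."""
    # Imported lazily so the rest of this sketch runs without `datasets`.
    from datasets import load_dataset

    return load_dataset("NbAiLab/norec_agg")


if __name__ == "__main__":
    for name, share in split_fractions(EXPECTED_SIZES).items():
        print(f"{name}: {EXPECTED_SIZES[name]} examples ({share:.1%})")
    # Uncomment to compare against the downloaded data:
    # dataset = load_norec_agg()
    # for name, expected in EXPECTED_SIZES.items():
    #     assert dataset[name].num_rows == expected
```

Keeping the expected sizes in one dictionary makes it easy to spot a mismatch if the hosted data is later revised.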
This dataset is based largely on the original data described in the paper "A Fine-Grained Sentiment Dataset for Norwegian" by L. Øvrelid, P. Mæhlum, J. Barnes, and E. Velldal, accepted at LREC 2020. However, we have since added annotations for another 3476 sentences, increasing the overall size and scope of the dataset.
This work is licensed under a Creative Commons Attribution 4.0 International License.
@misc{sheng2020investigating,
  title={Investigating Societal Biases in a Poetry Composition System},
  author={Emily Sheng and David Uthus},
  year={2020},
  eprint={2011.02686},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}