Dataset: strombergnlp/nlpcc-stance
Tasks: Text Classification
Sub-tasks: sentiment-analysis
Languages: zh
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotations creators: expert-generated
Source datasets: original
Other: stance-detection
License: cc-by-4.0
This is a stance prediction dataset in Chinese. The data comes from the shared task on stance detection in Chinese microblogs at NLPCC-ICCPOL 2016. It covers Task A, a mandatory supervised task that detects stance towards five targets of interest using the given labeled data. Some instances have been removed from the dataset because they had no label.
Chinese, as written on the Weibo website (bcp47:zh)
Example instance:
{ 'id': '0', 'target': 'IphoneSE', 'text': '3月31日,苹果iPhone SE正式开卖,然而这款小屏新机并未出现人们预想的疯抢局面。根据市场分析机构Localytics周一公布的数据,iPhone SE正式上市的这个周末,销量成绩并不算太好。', 'stance': 2 }
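The instance layout above can be checked with a small helper before rows are used downstream. This is a minimal sketch: the names `StanceInstance` and `check_instance` are hypothetical and not part of the dataset or any library, and the field types are inferred from the single example shown. The meaning of each integer `stance` value is not stated here, so the sketch treats it as an opaque class label.

```python
from typing import TypedDict


class StanceInstance(TypedDict):
    """One row, with field types inferred from the example instance."""
    id: str
    target: str
    text: str
    stance: int  # integer class label; its label-to-name mapping is not given here


def check_instance(row: dict) -> StanceInstance:
    """Return the row as a StanceInstance, or raise ValueError if it is malformed."""
    expected = {"id": str, "target": str, "text": str, "stance": int}
    for field, field_type in expected.items():
        if field not in row:
            raise ValueError(f"missing field: {field}")
        if not isinstance(row[field], field_type):
            raise ValueError(f"{field} should be {field_type.__name__}")
    # Keep only the four expected fields.
    return StanceInstance(**{name: row[name] for name in expected})


# The example instance from the card, with the text shortened for brevity.
example = {
    "id": "0",
    "target": "IphoneSE",
    "text": "3月31日,苹果iPhone SE正式开卖",
    "stance": 2,
}
row = check_instance(example)
```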
The training split has 2986 instances.
The goal was to create a dataset of microblog text annotated for stance. Six stance targets were selected and data was collected from Sina Weibo for annotation.
Not specified
Who are the source language producers?
Sina Weibo users
The stance of each target-microblog pair is annotated independently by two students. If the two students give the same annotation, that annotation becomes the label for the pair. If their annotations differ, a third student is assigned to annotate the pair, and the final label is obtained by voting over the annotations.
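The annotation procedure above can be sketched as a small function. This is an illustration under stated assumptions, not the organizers' actual tooling: the function name `adjudicate` is hypothetical, labels are represented as integers, and the third annotator is modeled as a callable so that it is consulted only on disagreement. The text does not say what happens when all three annotations differ, so the sketch returns `None` in that case as an assumption.

```python
from collections import Counter
from typing import Callable, Optional


def adjudicate(first: int, second: int,
               third_annotator: Callable[[], int]) -> Optional[int]:
    """Resolve the label for one target-microblog pair.

    first, second: the two independent annotations.
    third_annotator: called only when the first two disagree.
    """
    # Agreement between the two students settles the label.
    if first == second:
        return first
    # Otherwise bring in a third annotation and vote.
    third = third_annotator()
    label, count = Counter([first, second, third]).most_common(1)[0]
    # Assumption: with three different annotations there is no majority,
    # so no label is produced.
    return label if count >= 2 else None


agreed = adjudicate(1, 1, lambda: 0)      # third annotator is never consulted
tie_broken = adjudicate(0, 1, lambda: 1)  # third annotation breaks the tie
unresolved = adjudicate(0, 1, lambda: 2)  # no majority among the three
```

Passing the third annotator as a callable mirrors the described workflow, in which the third student is assigned only after a disagreement is detected.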
Who are the annotators?
Students in China
No reflections
The data preserves social media utterances verbatim and so does not accommodate any right to be forgotten, though usernames and post IDs are not explicitly included in the data.
This data carries at least a temporal and a regional bias, and it represents expressions of stance on only six topics.
The dataset is curated by the paper's authors.
The authors distribute this data under the Creative Commons Attribution license, CC-BY 4.0.
@incollection{xu2016overview,
  title={Overview of nlpcc shared task 4: Stance detection in chinese microblogs},
  author={Xu, Ruifeng and Zhou, Yu and Wu, Dongyin and Gui, Lin and Du, Jiachen and Xue, Yun},
  booktitle={Natural language understanding and intelligent applications},
  pages={907--916},
  year={2016},
  publisher={Springer}
}