Dataset:
fewshot-goes-multilingual/cs_mall-product-reviews
License:
cc-by-nc-sa-3.0
Source datasets:
original
Annotations creators:
found
Language creators:
found
Size:
10K<n<100K
Multilinguality:
monolingual
Language:
cs
Tasks:
text-classification

The dataset contains user reviews from the Czech e-shop <mall.cz>. Each review contains text, sentiment (positive/negative/neutral), and an automatically detected language (mostly Czech, occasionally Slovak), identified with lingua-py. The dataset has 30,000 reviews in total (train + validation + test). The data is balanced.
The train set has 8000 positive, 8000 neutral and 8000 negative reviews. The validation and test sets each have 1000 positive, 1000 neutral and 1000 negative reviews.
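The split sizes above can be checked with a few lines of Python. This is a minimal sketch, not part of the original card: the helper names `expected_split_sizes`, `is_balanced` and `load_label_counts` are hypothetical. The card does not name the sentiment column, so it is passed in as a parameter. Loading assumes the third-party `datasets` library is installed and that the Hub is reachable.

```python
from collections import Counter

DATASET_ID = "fewshot-goes-multilingual/cs_mall-product-reviews"

# Per-class review counts as stated in the card.
PER_CLASS = {"train": 8000, "validation": 1000, "test": 1000}
LABELS = ("positive", "neutral", "negative")


def expected_split_sizes():
    """Total reviews per split: per-class count times the number of classes."""
    return {split: n * len(LABELS) for split, n in PER_CLASS.items()}


def is_balanced(labels):
    """True if every label that occurs in `labels` occurs equally often."""
    counts = Counter(labels)
    return len(set(counts.values())) == 1


def load_label_counts(label_column):
    """Count labels per split on the real data.

    `label_column` is an assumption supplied by the caller; inspect the
    dataset's features to find the actual sentiment column name.
    """
    from datasets import load_dataset  # third-party, imported lazily

    ds = load_dataset(DATASET_ID)
    return {split: Counter(ds[split][label_column]) for split in ds}


if __name__ == "__main__":
    sizes = expected_split_sizes()
    print(sizes, "total:", sum(sizes.values()))
```

Comparing the output of `load_label_counts` against `PER_CLASS` is one way to confirm that a downloaded copy matches the sizes described here.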
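Each sample contains the review text, a sentiment label (positive/negative/neutral), and the automatically detected language.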
The data is a processed adaptation of the Mall CZ corpus. The adaptation is label-balanced and adds the automatically detected language.