Dataset: allegro_reviews
Task: text classification
Language: pl
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotation creators: found
Source datasets: original
License: cc-by-sa-4.0

Allegro Reviews is a sentiment analysis dataset consisting of 11,588 product reviews written in Polish and extracted from Allegro.pl, a popular e-commerce marketplace. Each review contains at least 50 words and has a rating on a scale from one (negative review) to five (positive review).
We recommend using the provided train/dev/test split. The ratings for the test set reviews are kept hidden. You can evaluate your model using the online evaluation tool available on klejbenchmark.com.
Sentiment analysis of product reviews. Leaderboard: https://klejbenchmark.com/leaderboard/
Polish
Two TSV files (train, dev) with two columns (text, rating) and one TSV file (test) with a single column (text).
The data is split into train, dev, and test sets.
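The file layout described above can be read with the Python standard library. The following is a minimal sketch, not the official loader: it assumes each file is tab-delimited, starts with a header row naming the columns `text` and `rating`, and uses no quoting. The function names `load_reviews` and `rating_counts` are hypothetical helpers introduced for illustration.

```python
import csv
from collections import Counter
from pathlib import Path


def load_reviews(path):
    """Read one split and return a list of (text, rating) pairs.

    rating is an int from 1 to 5, or None when the file has no
    rating column (as in the test split).
    """
    rows = []
    with Path(path).open(encoding="utf-8", newline="") as f:
        # Assumption: tab-delimited, header row present, no quoting.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            rating = row.get("rating")
            # int(float(...)) accepts ratings written as "5" or "5.0".
            rows.append((row["text"], int(float(rating)) if rating else None))
    return rows


def rating_counts(rows):
    """Count how many labeled reviews carry each rating."""
    return Counter(r for _, r in rows if r is not None)
```

With files named, for example, `train.tsv` and `test.tsv` (names assumed here), `rating_counts(load_reviews("train.tsv"))` gives the label distribution of the training split, while `load_reviews("test.tsv")` returns every review paired with `None`, since the test ratings are hidden.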
This dataset is one of the nine evaluation tasks in the KLEJ benchmark, created to improve Polish language processing.
Allegro Reviews is a set of product reviews from a popular e-commerce marketplace (Allegro.pl).
Who are the source language producers?
Customers of an e-commerce marketplace.
[More Information Needed]
Who are the annotators?
[More Information Needed]
Allegro Machine Learning Research team klejbenchmark@allegro.pl
Dataset licensed under CC BY-SA 4.0
@inproceedings{rybak-etal-2020-klej,
    title = "{KLEJ}: Comprehensive Benchmark for Polish Language Understanding",
    author = "Rybak, Piotr and Mroczkowski, Robert and Tracz, Janusz and Gawlik, Ireneusz",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.111",
    pages = "1191--1201",
}
Thanks to @abecadel for adding this dataset.