数据集:
clarin-pl/polemo2-official
任务:
文本分类语言:
pl计算机处理:
monolingual语言创建人:
other批注创建人:
expert-generated源数据集:
original许可:
cc-by-sa-4.0The PolEmo2.0 is a dataset of online consumer reviews from four domains: medicine, hotels, products, and university. It is human-annotated on a level of full reviews and individual sentences. Current version (PolEmo 2.0) contains 8,216 reviews having 57,466 sentences. Each text and sentence was manually annotated with sentiment in the 2+1 scheme, which gives a total of 197,046 annotations. About 85% of the reviews are from the medicine and hotel domains. Each review is annotated with four labels: positive, negative, neutral, or ambiguous.
The task is to predict the correct label of the review.
Input (' text ' column): sentence
Output (' target ' column): label for sentence sentiment ('zero': neutral, 'minus': negative, 'plus': positive, 'amb': ambiguous)
Domain : Online reviews
Measurements : Accuracy, F1 Macro
Example :
Input: Na samym wejściu hotel śmierdzi . W pokojach jest pleśń na ścianach , brudny dywan . W łazience śmierdzi chemią , hotel nie grzeje w pokojach panuje chłód . Wyposażenie pokoju jest stare , kran się rusza , drzwi na balkon nie domykają się . Jedzenie jest w małych ilościach i nie smaczne . Nie polecam nikomu tego hotelu .
Input (translated by DeepL): At the very entrance the hotel stinks . In the rooms there is mold on the walls , dirty carpet . The bathroom smells of chemicals , the hotel does not heat in the rooms are cold . The room furnishings are old , the faucet moves , the door to the balcony does not close . The food is in small quantities and not tasty . I would not recommend this hotel to anyone .
Output: 1 (negative)
Subset | Cardinality |
---|---|
train | 6573 |
val | 823 |
test | 820 |
Class | train | dev | test |
---|---|---|---|
minus | 0.3756 | 0.3694 | 0.4134 |
plus | 0.2775 | 0.2868 | 0.2768 |
amb | 0.1991 | 0.1883 | 0.1659 |
zero | 0.1477 | 0.1555 | 0.1439 |
@inproceedings{kocon-etal-2019-multi, title = "Multi-Level Sentiment Analysis of {P}ol{E}mo 2.0: Extended Corpus of Multi-Domain Consumer Reviews", author = "Koco{\'n}, Jan and Mi{\l}kowski, Piotr and Za{\'s}ko-Zieli{\'n}ska, Monika", booktitle = "Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)", month = nov, year = "2019", address = "Hong Kong, China", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/K19-1092", doi = "10.18653/v1/K19-1092", pages = "980--991", abstract = "In this article we present an extended version of PolEmo {--} a corpus of consumer reviews from 4 domains: medicine, hotels, products and school. Current version (PolEmo 2.0) contains 8,216 reviews having 57,466 sentences. Each text and sentence was manually annotated with sentiment in 2+1 scheme, which gives a total of 197,046 annotations. We obtained a high value of Positive Specific Agreement, which is 0.91 for texts and 0.88 for sentences. PolEmo 2.0 is publicly available under a Creative Commons copyright license. We explored recent deep learning approaches for the recognition of sentiment, such as Bi-directional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT).", }
Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
from pprint import pprint from datasets import load_dataset dataset = load_dataset("clarin-pl/polemo2-official") pprint(dataset['train'][0]) # {'target': 1, # 'text': 'Na samym wejściu hotel śmierdzi . W pokojach jest pleśń na ścianach ' # ', brudny dywan . W łazience śmierdzi chemią , hotel nie grzeje w ' # 'pokojach panuje chłód . Wyposażenie pokoju jest stare , kran się ' # 'rusza , drzwi na balkon nie domykają się . Jedzenie jest w małych ' # 'ilościach i nie smaczne . Nie polecam nikomu tego hotelu .'}
import random from pprint import pprint from datasets import load_dataset, load_metric dataset = load_dataset("clarin-pl/polemo2-official") references = dataset["test"]["target"] # generate random predictions predictions = [random.randrange(max(references) + 1) for _ in range(len(references))] acc = load_metric("accuracy") f1 = load_metric("f1") acc_score = acc.compute(predictions=predictions, references=references) f1_score = f1.compute(predictions=predictions, references=references, average='macro') pprint(acc_score) pprint(f1_score) # {'accuracy': 0.2475609756097561} # {'f1': 0.23747048177471738}