Dataset:
tyqiangz/multilingual-sentiments
Tasks:
Text Classification
License:
apache-2.0
A collection of multilingual sentiment datasets grouped into 3 classes: positive, neutral, and negative.
Most multilingual sentiment datasets are either 2-class (positive or negative), 5-class ratings of product reviews (e.g. the Amazon multilingual dataset), or multi-class emotion labels. However, for an average person, positive, negative, and neutral classes often suffice and are more straightforward to perceive and annotate. A purely positive/negative classification is also too naive, since most of the text in the world is actually neutral in sentiment. Furthermore, most multilingual sentiment datasets are dominated by Western languages (e.g. English, German) and don't include Asian languages (e.g. Malay, Indonesian).
Git repo: https://github.com/tyqiangz/multilingual-sentiment-datasets
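To make the contrast between 5-class ratings and the 3-class scheme concrete, here is a minimal sketch of how star ratings could be collapsed into positive, neutral, and negative labels. The function name `stars_to_sentiment` and the thresholds (1-2 negative, 3 neutral, 4-5 positive) are assumptions chosen for illustration; they are not taken from the dataset card and may not match how the source datasets were actually grouped.

```python
from typing import Literal

Sentiment = Literal["negative", "neutral", "positive"]


def stars_to_sentiment(stars: int) -> Sentiment:
    """Collapse a 1-5 star rating into three sentiment classes.

    Hypothetical thresholds (an assumption, not the dataset's documented rule):
    1-2 stars -> negative, 3 stars -> neutral, 4-5 stars -> positive.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"expected a rating between 1 and 5, got {stars}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Example: map a handful of made-up ratings to 3-class labels.
ratings = [1, 3, 5, 4, 2]
labels = [stars_to_sentiment(r) for r in ratings]
print(labels)
```

Treating the middle rating as neutral is only one possible convention; a different split may suit review data where 3-star ratings lean negative.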