Datasets: oclar
License: unknown
Source datasets: original
Annotations creators: crowdsourced
Language creators: crowdsourced
Size: 1K<n<10K
Multilinguality: monolingual
Language: ar
Tasks: text classification
The OCLAR researchers (Marwan et al., 2019) gathered Arabic customer reviews from Google Reviews and the Zomato website across a wide range of domains, including restaurants, hotels, hospitals, and local shops. The resulting corpus contains 3916 reviews on a 5-point rating scale. For this research, the positive class covers ratings of 3 to 5 stars (3465 reviews), and the negative class covers ratings of 1 and 2 (451 reviews).
The Opinion Corpus for Lebanese Arabic Reviews (OCLAR) can be used for Arabic sentiment classification on reviews of services, including hotels, restaurants, shops, and others.
The text in the dataset is in Arabic, mainly Lebanese Arabic (LB). The associated BCP-47 code is ar-LB.
A typical data point comprises a pagename, which is the name of the service or location being reviewed; a review, which is the text left by the user or client; and a rating, which is a score between 1 and 5.
The authors consider a review positive if its rating is greater than or equal to 3; otherwise it is considered negative.
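The labeling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name sentiment_label and the label strings "positive" and "negative" are assumptions introduced here.

```python
def sentiment_label(rating: int) -> str:
    """Map a 1-5 star rating to a binary sentiment label.

    Hypothetical helper: ratings of 3 or more count as positive,
    ratings of 1 or 2 count as negative, as described in this card.
    """
    if not 1 <= rating <= 5:
        raise ValueError(f"rating must be between 1 and 5, got {rating}")
    return "positive" if rating >= 3 else "negative"


# Show the label assigned to each possible star rating.
labels = {stars: sentiment_label(stars) for stars in range(1, 6)}
for stars, label in labels.items():
    print(stars, label)
```

Note that a rating of 3 falls in the positive class under this rule, which contributes to the strong class imbalance in the corpus.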
An example from the OCLAR data set looks as follows:
"pagename": 'Ramlet Al Baida Beirut Lebanon', "review": 'مكان يطير العقل ويساعد على الاسترخاء', "rating": 5,
The data set comes in a single CSV file with a total of 3916 reviews: 3465 positive (ratings 3 to 5) and 451 negative (ratings 1 and 2).
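One way to check the class split after downloading the file is sketched below, using only the Python standard library. It assumes the CSV has a header row with the columns pagename, review, and rating; that header, the function name count_classes, and the three-row sample are assumptions made for illustration, not the real data.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row stand-in for the real CSV file. The header names
# follow the fields described in this card and are an assumption.
SAMPLE_CSV = """pagename,review,rating
Place A,review text one,5
Place B,review text two,2
Place C,review text three,3
"""


def count_classes(csv_file) -> Counter:
    """Count positive (rating >= 3) and negative (rating < 3) reviews."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        rating = int(row["rating"])
        counts["positive" if rating >= 3 else "negative"] += 1
    return counts


counts = count_classes(io.StringIO(SAMPLE_CSV))
total = sum(counts.values())
for name in ("positive", "negative"):
    print(f"{name}: {counts[name]} ({counts[name] / total:.1%})")
```

To run this on the actual file, pass a handle opened with encoding="utf-8" and newline="" in place of the in-memory sample. If the header assumption holds, the counts should match the 3465 and 451 reported in this card.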
This dataset was created for Arabic sentiment classification on reviews of services in Lebanon. The reviews cover public services, including hotels, restaurants, shops, and others.
The data was collected from Google Reviews and the Zomato website.
Who are the source language producers?
The source language producers are people who posted their reviews on Google Reviews or the Zomato website. They are mainly Arabic-speaking Lebanese people.
The dataset does not contain any additional annotations.
Who are the annotators?
[More Information Needed]
[More Information Needed]
The authors' research tackles the important task of sentiment analysis for Arabic in the Lebanese context, using 3916 service reviews from Google and Zomato. Experiments show three main findings:
[More Information Needed]
[More Information Needed]
This dataset was curated by Marwan Al Omari and Moustafa Al-Hajj from the Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon; Nacereddine Hammami from the College of Computer and Information Sciences, Jouf University, Aljouf, KSA; and Amani Sabra from the Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon.
[More Information Needed]
@misc{Dua:2019,
  author = "Dua, Dheeru and Graff, Casey",
  year = "2017",
  title = "{UCI} Machine Learning Repository",
  url = "http://archive.ics.uci.edu/ml",
  institution = "University of California, Irvine, School of Information and Computer Sciences"
}
@InProceedings{AlOmari2019oclar,
  title = {Sentiment Classifier: Logistic Regression for Arabic Services Reviews in Lebanon},
  authors = {Al Omari, M., Al-Hajj, M., Hammami, N., & Sabra, A.},
  year = {2019}
}
Thanks to @alaameloh for adding this dataset.