Dataset: covid_tweets_japanese
Tasks: text-classification
Sub-tasks: fact-checking
Languages: ja
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotation creators: crowdsourced
Source datasets: original
License: cc-by-nd-4.0

53,640 Japanese tweets annotated for whether each tweet is related to COVID-19. Each label is decided by majority vote among 5 to 10 crowd workers. Target tweets contain "COVID" or "コロナ". The tweets were posted between roughly January 2020 and June 2020. The original tweet texts are not included; they can be retrieved by ID, for example through the Twitter API.
Text classification: whether the tweet is related to COVID-19, and whether it is fact or opinion.
The text, which can be retrieved using the IDs in this dataset, is in Japanese and was posted on Twitter.
A CSV file in which the first column is the Twitter ID and the second column is the assessment option ID.
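As a minimal sketch of reading this two-column layout, the snippet below parses CSV text into (tweet ID, assessment option ID) pairs and counts how often each option occurs. The function name `read_labels`, the `has_header` flag, and the sample IDs and option values are all hypothetical placeholders; whether the real file has a header row, and what the option IDs mean, are not stated here.

```python
import csv
import io
from collections import Counter


def read_labels(csv_text, has_header=False):
    """Parse two-column CSV text into (tweet_id, option_id) string pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # assumption: skip one header row if present
    rows = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        # Keep IDs as strings so long tweet IDs are never rounded or reformatted.
        rows.append((row[0].strip(), row[1].strip()))
    return rows


# Placeholder rows, not taken from the real dataset.
sample = "1000000000000000001,1\n1000000000000000002,2\n1000000000000000003,1\n"
pairs = read_labels(sample)
counts = Counter(option for _, option in pairs)
```

To read the real file, one would pass its contents (for example from `open(path, encoding="utf-8").read()`) in place of `sample`.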
No paper has been published for this dataset yet. The author appears to intend to publish one, though it is not certain that it will include split information. Information on data splits is therefore not provided at this time.
[More Information Needed], because the paper has not yet been published.
53,640 Japanese tweets annotated for whether each tweet is related to COVID-19. Target tweets contain "COVID" or "コロナ". The tweets were posted between roughly January 2020 and June 2020.
Who are the source language producers? The language producers are users of Twitter.
Each annotation is decided by majority vote among 5 to 10 crowd workers.
Who are the annotators? Crowd workers.
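The majority decision over crowd-worker votes can be illustrated with a short sketch. This is not the author's aggregation code: the function `majority_label` and the label strings are hypothetical, the sketch picks the most frequent label (a plurality), and returning `None` on a tie is an assumption, since how ties were resolved is not documented.

```python
from collections import Counter


def majority_label(votes):
    """Return the most frequent label, or None if there is no single winner."""
    if not votes:
        return None
    ranked = Counter(votes).most_common()  # sorted by count, highest first
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: ties are left unresolved
    return ranked[0][0]


# Hypothetical votes from 5 workers for one tweet.
example = majority_label(["related", "related", "unrelated", "related", "unrelated"])
tied = majority_label(["related", "unrelated"])
```

With the placeholder votes above, `example` is the label chosen by three of the five workers, and `tied` is `None` because neither label leads.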
The dataset does not contain the original tweets.
[More Information Needed]
[More Information Needed]
[More Information Needed]
The dataset is hosted by Suzuki Laboratory, Gifu University, Japan.
CC-BY-ND 4.0
A related paper has not yet been published. The author asks that the dataset be cited as 「鈴木 優: COVID-19 日本語 Twitter データセット ( http://www.db.info.gifu-u.ac.jp/data/Data_5f02db873363f976fce930d1 ) 」.
Thanks to @forest1988 for adding this dataset.