Dataset: dengue_filipino
Tasks: text classification
Languages: tl
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: crowdsourced
Source datasets: original
License: unknown

Benchmark dataset for low-resource multiclass classification, with 4,015 training, 500 test, and 500 validation examples. Each example is a tweet annotated against five classes, and a single example can belong to more than one class.
The dataset is primarily in Filipino, with the addition of some English words commonly used in Filipino vernacular.
Sample data:
{ "text": "Tapos ang dami pang lamok.", "absent": "0", "dengue": "0", "health": "0", "mosquito": "1", "sick": "0" }
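The record above stores each label as the string "0" or "1". A minimal sketch of turning one record into a multi-hot vector and a list of active label names is shown below. The `LABELS` ordering and the `active_labels` helper are illustrative assumptions, not part of the dataset's tooling.

```python
import json

# Label columns as they appear in the sample record; this ordering is an assumption.
LABELS = ["absent", "dengue", "health", "mosquito", "sick"]


def active_labels(record):
    """Return the names of the labels set to 1 in one record (hypothetical helper)."""
    # The sample stores labels as the strings "0"/"1"; int() also accepts real integers.
    return [name for name in LABELS if int(record[name]) == 1]


sample = json.loads(
    '{"text": "Tapos ang dami pang lamok.", "absent": "0", "dengue": "0", '
    '"health": "0", "mosquito": "1", "sick": "0"}'
)

# Multi-hot vector in LABELS order, suitable as a multi-label training target.
vector = [int(sample[name]) for name in LABELS]

print(active_labels(sample))  # ['mosquito']
print(vector)                 # [0, 0, 0, 1, 0]
```

Because an example can belong to several classes at once, a multi-hot vector is a more natural target than a single class index.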
Who are the source language producers?
[More Information Needed]
Who are the annotators?
[More Information Needed]
Dataset curator: Jan Christian Cruz
@INPROCEEDINGS{8459963,
  author={E. D. {Livelo} and C. {Cheng}},
  booktitle={2018 IEEE International Conference on Agents (ICA)},
  title={Intelligent Dengue Infoveillance Using Gated Recurrent Neural Learning and Cross-Label Frequencies},
  year={2018},
  pages={2-7},
  doi={10.1109/AGENTS.2018.8459963}
}
Thanks to @anaerobeth for adding this dataset.