Datasets: dengue_filipino
Tasks:
Languages:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: crowdsourced
Source datasets: original
License:
Benchmark dataset for low-resource multiclass classification, with 4,015 training, 500 testing, and 500 validation examples. Each example is a tweet labeled against five classes (absent, dengue, health, mosquito, sick). A single example can belong to more than one class, so in practice the task is multi-label.
The dataset is primarily in Filipino, with the addition of some English words commonly used in Filipino vernacular.
Sample data:
{
"text": "Tapos ang dami pang lamok.",
"absent": "0",
"dengue": "0",
"health": "0",
"mosquito": "1",
"sick": "0"
}
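Because each label is stored as its own field, a record can be turned into a multi-hot vector with a few lines of Python. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `to_multi_hot` and `active_labels` are hypothetical, the label order in `LABELS` is an assumption (the fields are simply listed alphabetically), and the string-valued labels follow the sample record above.

```python
import json

# Label names come from the sample record; this ordering is an assumption.
LABELS = ["absent", "dengue", "health", "mosquito", "sick"]


def to_multi_hot(record):
    """Convert a record's label fields ("0"/"1" strings) to a list of ints."""
    return [int(record[name]) for name in LABELS]


def active_labels(record):
    """Return the names of the labels that are switched on for this record."""
    return [name for name in LABELS if int(record[name]) == 1]


# The sample record shown above, parsed from JSON.
sample = json.loads("""
{
  "text": "Tapos ang dami pang lamok.",
  "absent": "0",
  "dengue": "0",
  "health": "0",
  "mosquito": "1",
  "sick": "0"
}
""")

print(to_multi_hot(sample))   # [0, 0, 0, 1, 0]
print(active_labels(sample))  # ['mosquito']
```

Calling `int()` on each field means the helpers also work if the labels arrive as integers rather than strings.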
Who are the source language producers? [More Information Needed]
Who are the annotators? [More Information Needed]
Jan Christian Cruz
@INPROCEEDINGS{8459963,
  author={E. D. {Livelo} and C. {Cheng}},
  booktitle={2018 IEEE International Conference on Agents (ICA)},
  title={Intelligent Dengue Infoveillance Using Gated Recurrent Neural Learning and Cross-Label Frequencies},
  year={2018},
  volume={},
  number={},
  pages={2-7},
  doi={10.1109/AGENTS.2018.8459963}
}
Thanks to @anaerobeth for adding this dataset.