Hate-speech-CNERG/indic-abusive-allInOne-MuRIL | ATYUN.COM

Model:

Hate-speech-CNERG/indic-abusive-allInOne-MuRIL

Task:

Text classification

Libraries:

PyTorch Transformers

Languages:

Other:

bert

Preprint:

arxiv:2204.12543

License:

afl-3.0


This model detects abusive speech in Bengali, Devanagari Hindi, Code-mixed Hindi, Code-mixed Kannada, Code-mixed Malayalam, Marathi, Code-mixed Tamil, Urdu, Code-mixed Urdu, and English. The "allInOne" in the name refers to joint (cross-lingual) training, in which the model is trained on the data from all of these languages together. It is fine-tuned from the MuRIL model with a learning rate of 2e-5. Training code can be found at this url

LABEL_0 :-> Normal

LABEL_1 :-> Abusive
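
The label mapping above can be applied when running the model. The following is a minimal sketch, not the authors' code: it assumes the checkpoint is reachable under the model name shown on this page, that it loads through the standard Transformers `pipeline` API, and that it emits the raw `LABEL_0` / `LABEL_1` names listed above. The helper names `readable` and `classify` are hypothetical.

```python
# Readable names for the raw labels, as listed on the model card.
LABEL_NAMES = {"LABEL_0": "Normal", "LABEL_1": "Abusive"}

# Model identifier as shown on this page (assumed to resolve on the model hub).
MODEL_ID = "Hate-speech-CNERG/indic-abusive-allInOne-MuRIL"


def readable(prediction):
    """Map one pipeline result, e.g. {"label": "LABEL_1", "score": 0.93},
    to a (readable_label, score) pair."""
    return LABEL_NAMES[prediction["label"]], prediction["score"]


def classify(texts):
    """Classify a list of strings; downloads the checkpoint on first use."""
    # Imported lazily so the mapping helper works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    return [readable(p) for p in classifier(texts)]


# Example call (requires network access and the transformers + torch packages):
# for label, score in classify(["Have a nice day!"]):
#     print(f"{label}: {score:.3f}")
```

Each returned pair holds the readable label and the model's confidence for that label; any decision threshold applied on top of the score is left to the caller.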

For more details, see our paper:

Mithun Das, Somnath Banerjee and Animesh Mukherjee. "Data Bootstrapping Approaches to Improve Low Resource Abusive Language Detection for Indic Languages". Accepted at ACM HT 2022.

Please cite our paper in any published work that uses any of these resources.

@article{das2022data,
  title={Data Bootstrapping Approaches to Improve Low Resource Abusive Language Detection for Indic Languages},
  author={Das, Mithun and Banerjee, Somnath and Mukherjee, Animesh},
  journal={arXiv preprint arXiv:2204.12543},
  year={2022}
}

Author:

Hate-ALERT

Dataset size:

909.31 MB