JungleLee/bert-toxic-comment-classification | ATYUN.COM official site: a full-service platform for AI tutorials and news

Model:

JungleLee/bert-toxic-comment-classification

Task:

Text classification

Libraries:

PyTorch Transformers

Language:

Other:

bert

License:

afl-3.0

Model description

This model is a fine-tuned version of the bert-base-uncased model to classify toxic comments.

How to use

You can use the model with the following code.

from transformers import BertForSequenceClassification, BertTokenizer, TextClassificationPipeline

# Load the fine-tuned tokenizer and the two-class classifier by model ID.
model_path = "JungleLee/bert-toxic-comment-classification"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path, num_labels=2)

# Wrap both in a pipeline that returns a label and a score for each input text.
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("You're a fucking nerd."))
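
The pipeline returns a list with one dictionary per input text, each holding a "label" and a "score". The following is a minimal sketch of turning that output into a yes/no decision. The helper name flag_toxic, the label name "LABEL_1", and the threshold are assumptions made for illustration; this page does not say which label denotes the toxic class, so check model.config.id2label before relying on it.

```python
from typing import Dict, List


def flag_toxic(results: List[Dict], toxic_label: str, threshold: float = 0.5) -> List[bool]:
    """Return True for each prediction whose label is `toxic_label`
    and whose score is at least `threshold`."""
    return [r["label"] == toxic_label and r["score"] >= threshold for r in results]


# Hypothetical pipeline output for two inputs. The label names here are
# placeholders; the real ones come from model.config.id2label.
example_results = [
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.88},
]

flags = flag_toxic(example_results, toxic_label="LABEL_1", threshold=0.9)
print(flags)
```

Raising the threshold trades recall for precision, which may be preferable when flagged comments are removed automatically rather than sent for review.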

Training data

The training data comes from this Kaggle competition. We use 90% of the rows in train.csv to train the model.
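
A 90% training split like the one described above could be produced as in the sketch below. The source does not say how the split was made, so the random shuffle, the fixed seed, and the helper name split_rows are all assumptions, and the integer list stands in for the rows of train.csv.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_rows(
    rows: Sequence[T], train_fraction: float = 0.9, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle row indices with a fixed seed and cut them into two parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    cut = int(len(indices) * train_fraction)
    train = [rows[i] for i in indices[:cut]]
    held_out = [rows[i] for i in indices[cut:]]
    return train, held_out


# Placeholder data standing in for the rows of train.csv.
rows = list(range(1000))
train_rows, held_out_rows = split_rows(rows)
print(len(train_rows), len(held_out_rows))
```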

Evaluation results

The model achieves an AUC of 0.95 on a held-out test set of 1,500 rows.
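
For readers unfamiliar with the metric, ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It is not the author's evaluation code, and the labels and scores are made-up illustration values.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = toxic) and model scores for four comments.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a test set of about 1,500 rows; a rank-based formulation would scale better for larger sets.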

Author:

Jianguo Li

Dataset size:

418.4 MB