Model: mrm8488/xlm-roberta-base-finetuned-HC3-mix
Task: Text Classification
Language: multilingual
DOI: 10.57967/hf/0306
Tags: xlm-roberta
Preprint: arXiv:2301.07597
License: openrail

XLM-RoBERTa (base) fine-tuned on the Hello-SimpleAI HC3 corpus for ChatGPT-generated text detection.
All credit to Hello-SimpleAI for their huge work!
The base XLM-RoBERTa model was pre-trained on 2.5 TB of filtered CommonCrawl data covering 100 languages. It was introduced in the paper Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al. and first released in this repository.
The model was fine-tuned on HC3, the first human-ChatGPT comparison corpus, released by Hello-SimpleAI. The dataset is introduced in the paper How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection (arXiv:2301.07597).
| Metric | Value  |
|--------|--------|
| F1     | 0.9736 |
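For reference, the reported F1 is the harmonic mean of precision and recall. A minimal sketch of computing it from raw prediction counts (an illustrative helper, not part of the model's evaluation code):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: 97 true positives, 2 false positives, 3 false negatives
print(round(f1_score(97, 2, 3), 4))  # → 0.9749
```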
```python
from transformers import pipeline

# Load the fine-tuned detector as a text-classification pipeline
ckpt = "mrm8488/xlm-roberta-base-finetuned-HC3-mix"
detector = pipeline("text-classification", model=ckpt)

text = "Here your text..."
result = detector(text)
print(result)
```
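The pipeline returns a list of dicts with `label` and `score` keys. A minimal sketch of thresholding that output to get a boolean decision (the label string `"ChatGPT"` is an assumption for illustration; check the model's `config.id2label` for the actual names):

```python
def is_generated(results: list[dict], positive_label: str = "ChatGPT",
                 threshold: float = 0.5) -> bool:
    """Return True if the top-scoring prediction is the positive label
    and its score clears the threshold.

    `results` mimics the pipeline output, e.g. [{"label": ..., "score": ...}].
    The label name "ChatGPT" is hypothetical; inspect model.config.id2label.
    """
    top = max(results, key=lambda r: r["score"])
    return top["label"] == positive_label and top["score"] >= threshold

# Simulated pipeline output for illustration
fake_result = [{"label": "ChatGPT", "score": 0.98}]
print(is_generated(fake_result))  # → True
```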
```bibtex
@misc{manuel_romero_2023,
  author    = {Manuel Romero},
  title     = {xlm-roberta-base-finetuned-HC3-mix (Revision b18de48)},
  year      = 2023,
  url       = {https://huggingface.co/mrm8488/xlm-roberta-base-finetuned-HC3-mix},
  doi       = {10.57967/hf/0306},
  publisher = {Hugging Face}
}
```