模型:

arpanghoshal/EmoRoBERTa

中文

Connect me on LinkedIn

What is GoEmotions

Dataset labelled 58000 Reddit comments with 28 emotions

  • admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise + neutral

What is RoBERTa

RoBERTa builds on BERT’s language masking strategy and modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer amount of time. This allows RoBERTa representations to generalize even better to downstream tasks compared to BERT.

Hyperparameters

Parameter
Learning rate 5e-5
Epochs 10
Max Seq Length 50
Batch size 16
Warmup Proportion 0.1
Epsilon 1e-8

Results

Best Result of Macro F1 - 49.30%

Usage

from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline

tokenizer = RobertaTokenizerFast.from_pretrained("arpanghoshal/EmoRoBERTa")
model = TFRobertaForSequenceClassification.from_pretrained("arpanghoshal/EmoRoBERTa")

emotion = pipeline('sentiment-analysis', 
                    model='arpanghoshal/EmoRoBERTa')

emotion_labels = emotion("Thanks for using it.")
print(emotion_labels)

Output

[{'label': 'gratitude', 'score': 0.9964383244514465}]