tinkoff-ai/response-quality-classifier-large

Model:

tinkoff-ai/response-quality-classifier-large

Task:

Text classification

Libraries:

PyTorch Transformers

Language:

Other:

roberta, conversational

License:

mit


This classification model is based on sberbank-ai/ruRoberta-large. It should be used to score the relevance and specificity of the last message in the context of a dialogue.

The labels mean the following:

relevance: whether the last message in the dialogue is relevant in the context of the full dialogue.
specificity: whether the last message in the dialogue is interesting and promotes the continuation of the dialogue.

The model was first pretrained on a large corpus of dialogue data in an unsupervised manner: it was trained to predict whether the last response came from the real dialogue or was pulled at random from some other dialogue. It was then fine-tuned on manually labelled examples (the dataset will be posted soon).
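The pretraining objective described above can be illustrated with a small data-construction sketch. This is not the authors' pipeline: the function name `make_pretraining_pairs`, the uniform sampling of the replacement dialogue, and the one-negative-per-positive ratio are all assumptions made for illustration.

```python
import random


def make_pretraining_pairs(dialogs, seed=0):
    """Build (context, response, label) triples from a list of dialogs.

    Each dialog is a list of message strings (at least two dialogs needed).
    label=1 keeps the real last message; label=0 swaps in the last message
    of a different, randomly chosen dialog. Hypothetical helper.
    """
    rng = random.Random(seed)
    pairs = []
    for i, dialog in enumerate(dialogs):
        context, real_response = dialog[:-1], dialog[-1]
        pairs.append((context, real_response, 1))
        # Assumption: the negative is drawn uniformly from the other dialogs.
        j = rng.choice([k for k in range(len(dialogs)) if k != i])
        pairs.append((context, dialogs[j][-1], 0))
    return pairs
```

A binary classifier trained on such triples learns to tell real continuations from random ones without any manual labels, which is what makes the stage unsupervised.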

The model was trained with three messages in the context and one response. Each message was tokenized separately with max_length = 32.
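A minimal sketch of how such an input could be assembled, assuming the string layout used in this card's usage example (`[CLS]`, `[SEP]` between context messages, `[RESPONSE_TOKEN]` before the response). The helper `build_input` and its `truncate` parameter are hypothetical names, not part of the model's API; `truncate` stands in for whatever per-message truncation to 32 tokens you apply with the tokenizer.

```python
def build_input(context, response, truncate=None):
    """Join context messages and a response into one model input string.

    `truncate` is an optional callable applied to every message, for example
    one that cuts a message to 32 tokens. Hypothetical helper.
    """
    # Keep the last three messages, matching the training setup.
    messages = list(context)[-3:]
    if truncate is not None:
        messages = [truncate(m) for m in messages]
        response = truncate(response)
    return "[CLS]" + "[SEP]".join(messages) + "[RESPONSE_TOKEN]" + response
```

Because the special tokens are written into the string by hand, the tokenizer should then be called with `add_special_tokens=False`, as in the usage example.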

Performance of the model on the validation split (the dataset will be posted soon), using the best thresholds for the validation samples:

label	threshold	f0.5	ROC AUC
relevance	0.59	0.86	0.83
specificity	0.61	0.85	0.86
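For readers unfamiliar with the f0.5 column: the F-beta score with beta = 0.5 weights precision more heavily than recall. The sketch below uses the standard definition F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); the function name `f_beta` is chosen here for illustration, and this is not the authors' evaluation code.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both are zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# With beta = 0.5, a precise-but-incomplete classifier scores higher
# than a complete-but-imprecise one.
high_precision = f_beta(precision=0.9, recall=0.5)
high_recall = f_beta(precision=0.5, recall=0.9)
```

Choosing the threshold that maximises f0.5 therefore tends to favour fewer false positives, which is probably what the "best thresholds" in the table reflect.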

How to use:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('tinkoff-ai/response-quality-classifier-large')
model = AutoModelForSequenceClassification.from_pretrained('tinkoff-ai/response-quality-classifier-large')
# Special tokens are written into the string by hand, so add_special_tokens=False.
# truncation=True makes max_length take effect on long inputs.
inputs = tokenizer('[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм, у тя как?', max_length=128, truncation=True, add_special_tokens=False, return_tensors='pt')
with torch.inference_mode():
    logits = model(**inputs).logits
    # Each label gets an independent sigmoid probability.
    probas = torch.sigmoid(logits)[0].cpu().detach().numpy()
relevance, specificity = probas
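To turn the two probabilities into yes/no decisions, they can be compared against the validation thresholds reported in the table (0.59 for relevance, 0.61 for specificity). The sketch below is an assumption about how one might do this; `apply_thresholds` is a hypothetical helper, and whether the comparison should be inclusive is not stated in the source.

```python
# Thresholds taken from the validation table in this card.
THRESHOLDS = {"relevance": 0.59, "specificity": 0.61}


def apply_thresholds(relevance, specificity, thresholds=THRESHOLDS):
    """Map the two probabilities to boolean decisions (hypothetical helper)."""
    return {
        # Assumption: a score equal to the threshold counts as positive.
        "relevance": bool(relevance >= thresholds["relevance"]),
        "specificity": bool(specificity >= thresholds["specificity"]),
    }
```

These thresholds were tuned on the authors' validation samples, so other data may call for different values.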

An app is available where you can easily interact with this model.

The work was done during an internship at Tinkoff by egoriyaa, mentored by solemn-leader.

Author:

Tinkoff AI

Dataset size:

1.33 GB