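Model:
clhuang/albert-sentiment
Fine-tuned from the ckiplab/albert pre-trained model. The training set contains only 80,000 examples, and the model is intended as an example model for a course.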
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("clhuang/albert-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("clhuang/albert-sentiment")

# Prediction
target_names = ['Negative', 'Positive']
max_length = 200  # maximum input length; if this exceeds the length used in training, the model's maximum length is the limit

def get_sentiment_proba(text):
    # tokenize the text into model inputs
    inputs = tokenizer(text, padding=True, truncation=True, max_length=max_length, return_tensors="pt")
    # run inference with the model
    outputs = model(**inputs)
    # apply softmax over the class dimension to get probabilities
    probs = outputs.logits.softmax(1)
    # map each class name to its probability, rounded to two decimals
    response = {name: round(float(p), 2) for name, p in zip(target_names, probs[0])}
    # alternatively, take the argmax to get the predicted label index
    # return probs.argmax()
    return response

get_sentiment_proba('我喜歡這本書')  # "I like this book"
get_sentiment_proba('不喜歡這款產品')  # "I don't like this product"
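The softmax and argmax steps in get_sentiment_proba can be sketched without loading the model. The following is a minimal illustration in plain Python, assuming a two-class output in the order Negative, Positive; the helper names `softmax`, `to_response`, and `to_label` and the logit values are hypothetical and are not real output from clhuang/albert-sentiment.

```python
import math

target_names = ['Negative', 'Positive']

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_response(logits):
    # Map each class name to its probability, rounded to two decimals.
    probs = softmax(logits)
    return {name: round(p, 2) for name, p in zip(target_names, probs)}

def to_label(logits):
    # The argmax over the probabilities gives the predicted class name.
    probs = softmax(logits)
    return target_names[probs.index(max(probs))]

# Hypothetical logits for illustration only, not real model output.
example_logits = [-1.0, 2.0]
response = to_response(example_logits)
label = to_label(example_logits)
print(response, label)
```

Returning the full probability dictionary, as the model card's function does, keeps the confidence visible to the caller; the argmax variant is enough when only the label is needed.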