Model:

yiyanghkust/finbert-tone

Task:

Text classification

Libraries:

PyTorch TensorFlow Transformers

Language:

Other:

financial-sentiment-analysis sentiment-analysis

English

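FinBERT is a BERT model pre-trained on financial communication text. Its purpose is to advance research and practice in financial NLP. It was trained on the following three financial communication corpora, totaling 4.9B tokens.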

Corporate Reports 10-K & 10-Q: 2.5B tokens
Earnings Call Transcripts: 1.3B tokens
Analyst Reports: 1.1B tokens

More technical details on FinBERT: Click Link

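The released finbert-tone model is the FinBERT model fine-tuned on 10,000 manually annotated (positive, negative, neutral) sentences from analyst reports. It achieves superior performance on the financial tone analysis task. If you simply want to use FinBERT for financial tone analysis, give it a try.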

If you use the model in your academic work, please cite the following paper:

Huang, Allen H., Hui Wang, and Yi Yang. "FinBERT: A Large Language Model for Extracting Information from Financial Text." Contemporary Accounting Research (2022).

How to use

You can use this model with the Transformers pipeline for sentiment analysis.

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

# Load the fine-tuned three-class classifier and its tokenizer.
finbert = BertForSequenceClassification.from_pretrained('yiyanghkust/finbert-tone', num_labels=3)
tokenizer = BertTokenizer.from_pretrained('yiyanghkust/finbert-tone')

# Wrap the model and tokenizer in a sentiment-analysis pipeline.
nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer)

sentences = ["there is a shortage of capital, and we need extra financing",
             "growth is strong and we have plenty of liquidity",
             "there are doubts about our finances",
             "profits are flat"]
results = nlp(sentences)  # one dict with 'label' and 'score' per sentence
print(results)  # LABEL_0: neutral; LABEL_1: positive; LABEL_2: negative
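
The label comment in the example above can be applied in code rather than read off by hand. The following is a minimal sketch, assuming the pipeline returns a list of dicts with `label` and `score` keys; the names `LABEL_NAMES` and `readable` are hypothetical helpers introduced here for illustration, and the sample scores are made up rather than real model output. Labels that are not of the generic `LABEL_n` form are passed through unchanged, in case the loaded checkpoint already emits readable names.

```python
# Mapping copied from the comment in the usage example above.
LABEL_NAMES = {
    "LABEL_0": "neutral",
    "LABEL_1": "positive",
    "LABEL_2": "negative",
}


def readable(results):
    """Return (label, score) pairs with generic LABEL_n ids replaced by names.

    Labels missing from LABEL_NAMES are kept as they are.
    """
    return [(LABEL_NAMES.get(r["label"], r["label"]), r["score"]) for r in results]


# Illustrative input only: these scores are invented, not produced by the model.
sample = [
    {"label": "LABEL_2", "score": 0.9},
    {"label": "LABEL_1", "score": 0.8},
]
print(readable(sample))
```

In practice, `readable(results)` would be called on the list returned by `nlp(sentences)`.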

Author:

Dataset size:

837.93 MB