Model:
mrm8488/spanbert-large-finetuned-squadv1
Created by Facebook Research and fine-tuned (by them) on SQuAD 1.1 for the Q&A downstream task.
SpanBERT: Improving Pre-training by Representing and Predicting Spans
You can get the fine-tuning script here
python code/run_squad.py \
  --do_train \
  --do_eval \
  --model spanbert-large-cased \
  --train_file train-v1.1.json \
  --dev_file dev-v1.1.json \
  --train_batch_size 32 \
  --eval_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 4 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_metric f1 \
  --output_dir squad_output \
  --fp16
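The `--max_seq_length` and `--doc_stride` flags determine how passages that do not fit in one model input are split. The following is a minimal sketch, assuming the script follows the sliding-window convention of the original BERT `run_squad.py`, where each new window starts `doc_stride` tokens after the previous one. The helper name `sliding_windows` and its parameters are hypothetical and are not taken from the SpanBERT code.

```python
from typing import List, Tuple


def sliding_windows(
    num_doc_tokens: int, max_tokens_per_window: int, doc_stride: int
) -> List[Tuple[int, int]]:
    """Return (start, length) pairs of overlapping windows over a tokenized passage.

    max_tokens_per_window is the room left for the passage after the question
    and special tokens are subtracted from the maximum sequence length
    (assumption: this mirrors the BERT-style SQuAD preprocessing).
    """
    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(num_doc_tokens - start, max_tokens_per_window)
        spans.append((start, length))
        if start + length == num_doc_tokens:
            break  # the last window reached the end of the passage
        start += min(length, doc_stride)
    return spans


# A 300-token passage with room for 200 passage tokens per window
# and a stride of 128 yields two overlapping windows.
windows = sliding_windows(300, 200, 128)
print(windows)
```

With a stride smaller than the window size, neighbouring windows overlap, so an answer span cut off at the edge of one window can still appear whole in the next.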
| | SQuAD 1.1 | SQuAD 2.0 | Coref | TACRED |
|---|---|---|---|---|
| | F1 | F1 | avg. F1 | F1 |
| BERT (base) | 88.5* | 76.5* | 73.1 | 67.7 |
| SpanBERT (base) | 92.4* | 83.6* | 77.4 | 68.2 |
| BERT (large) | 91.3 | 83.3 | 77.1 | 66.4 |
| SpanBERT (large) | 94.6 (this model) | 88.7 | 79.6 | 70.8 |
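Note: The numbers marked with * are evaluated on the development sets because those models were not submitted to the official SQuAD leaderboard. All the other numbers are test numbers.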
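For readers unfamiliar with the SQuAD F1 column, the metric measures token overlap between a predicted answer and a reference answer. The sketch below is modeled on the normalization and scoring approach of the SQuAD v1.1 evaluation script (lowercasing, stripping punctuation and English articles, then computing token-level F1). The function names `normalize_answer` and `squad_f1` are illustrative, and this is not the exact code used to produce the numbers in the table.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def squad_f1(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between one prediction and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


# The prediction contains the reference plus two extra tokens after normalization.
score = squad_f1("very hard in the repository", "very hard")
print(round(score, 4))
```

When a question has several reference answers, the usual practice is to take the maximum F1 over the references and then average over all questions.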
Fast usage with pipelines:
from transformers import pipeline

# Load the fine-tuned model together with the original SpanBERT tokenizer
qa_pipeline = pipeline(
    "question-answering",
    model="mrm8488/spanbert-large-finetuned-squadv1",
    tokenizer="SpanBERT/spanbert-large-cased"
)

qa_pipeline({
    'context': "Manuel Romero has been working very hard in the repository hugginface/transformers lately",
    'question': "How has been working Manuel Romero lately?"
})

# Output: {'answer': 'very hard in the repository hugginface/transformers',
#          'end': 82,
#          'score': 0.327230326857725,
#          'start': 31}
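The `start` and `end` fields in the output are character offsets into the context string, with `end` exclusive. As a small illustration that needs no model download, the snippet below reuses the context and the output values shown above and slices the context by those offsets. The variable names are illustrative.

```python
# Context and output values copied from the pipeline example above.
context = (
    "Manuel Romero has been working very hard in the repository "
    "hugginface/transformers lately"
)
result = {
    "answer": "very hard in the repository hugginface/transformers",
    "start": 31,
    "end": 82,
}

# Slicing the context with the reported offsets recovers the answer text.
span = context[result["start"]:result["end"]]
print(span == result["answer"])
```

This is handy when the answer needs to be highlighted in the original text rather than just displayed on its own.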
Made with ♥ in Spain