Dynamically quantized DistilBERT base uncased finetuned SST-2

Model Details
How to Get Started With the Model

Model Details

Model Description: This is a DistilBERT model fine-tuned on SST-2 and dynamically quantized with huggingface/optimum-intel, which uses Intel® Neural Compressor.

Model Type: Text Classification
Language(s): English
License: Apache-2.0
Parent Model: For more details on the original model, we encourage users to check out the original model's card.
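To make "dynamically quantized" concrete, here is a minimal pure-Python sketch of the general idea: weights are converted to INT8 ahead of time, while the activation scale is computed on the fly for each input. It is an illustration under simplifying assumptions (symmetric per-tensor scaling, toy values, hypothetical helper names) and not a description of how Intel® Neural Compressor implements it.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


# Weights are quantized once, before inference (toy values).
weights = [0.42, -1.27, 0.05, 0.9]
w_codes, w_scale = quantize_int8(weights)

# Activations are quantized at inference time, so their scale
# depends on the actual input: this is the "dynamic" part.
activations = [0.3, -0.8, 1.5, 0.1]
a_codes, a_scale = quantize_int8(activations)

# Integer dot product, rescaled back to floating point.
int_dot = sum(w * a for w, a in zip(w_codes, a_codes))
approx = int_dot * w_scale * a_scale
exact = sum(w * a for w, a in zip(weights, activations))

print(f"exact={exact:.4f} approx={approx:.4f}")
```

The small gap between the exact and approximate results is the rounding error that quantization trades for a smaller model.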

How to Get Started With the Model

PyTorch

To load the quantized model:

from optimum.intel.neural_compressor.quantization import IncQuantizedModelForSequenceClassification

model = IncQuantizedModelForSequenceClassification.from_pretrained("Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-dynamic")

ONNX

This is an INT8 ONNX model quantized with Intel® Neural Compressor.

The original FP32 model is the fine-tuned DistilBERT model.

Test result

Metric	INT8	FP32
Accuracy (eval-accuracy)	0.9025	0.9106
Model size (MB)	165	256
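The trade-off in the table can be worked out directly from its values. The short calculation below uses only the reported numbers; the variable names are illustrative.

```python
# Values copied from the test result table above.
int8_acc, fp32_acc = 0.9025, 0.9106
int8_mb, fp32_mb = 165, 256

# Absolute and relative accuracy loss from quantization.
acc_drop = fp32_acc - int8_acc
rel_drop_pct = 100 * acc_drop / fp32_acc

# Fraction of the FP32 model size saved by the INT8 model.
size_saving_pct = 100 * (1 - int8_mb / fp32_mb)

print(f"accuracy drop: {acc_drop:.4f} ({rel_drop_pct:.2f}% relative)")
print(f"size saving: {size_saving_pct:.1f}%")
```

In other words, the INT8 model gives up about 0.8 accuracy points in exchange for a file roughly 36% smaller.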

Load ONNX model:

from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained('Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-dynamic')
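Once loaded, the ONNX model can be used for inference. The sketch below wraps it in a transformers text-classification pipeline. It assumes that transformers and optimum with ONNX Runtime support are installed, and that the repository also ships tokenizer files. The `classify` and `format_prediction` helpers are hypothetical names introduced for illustration.

```python
MODEL_ID = "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-dynamic"


def format_prediction(result):
    """Render one pipeline result dict ({'label': ..., 'score': ...}) as text."""
    return f"{result['label']} ({result['score']:.3f})"


def classify(text, model_id=MODEL_ID):
    """Download the model and classify one sentence (needs network access)."""
    # Imported lazily so format_prediction works without these packages.
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    # Assumption: the repository contains tokenizer files alongside the model.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = ORTModelForSequenceClassification.from_pretrained(model_id)
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    # The pipeline returns a list with one dict per input text.
    return classifier(text)[0]
```

Calling `format_prediction(classify("A charming and often affecting journey."))` would then print the predicted label and its score. The label names come from the model's configuration.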

Author: Intel

Dataset size: 205.76 MB
