flax-community/gpt2-base-thai

Model:

flax-community/gpt2-base-thai

Task:

Text Generation

Libraries:

PyTorch JAX TensorBoard Transformers

Datasets:

oscar

Language:

Thai

Other:

gpt2 gpt2-base-thai text-generation-inference

License:

mit

GPT-2 Base Thai

GPT-2 Base Thai is a causal language model based on the OpenAI GPT-2 model. It was trained on the OSCAR dataset, specifically the unshuffled_deduplicated_th subset. The model was trained from scratch and achieved an evaluation loss of 1.708 and an evaluation perplexity of 5.516.

This model was trained using HuggingFace's Flax framework as part of the JAX/Flax Community Week organized by HuggingFace. All training was done on a TPUv3-8 VM sponsored by the Google Cloud team.

All scripts used for training can be found in the Files and versions tab, along with the training metrics logged via TensorBoard.

Model

Model	#params	Arch.	Training/Validation data (text)
gpt2-base-thai	124M	GPT-2	unshuffled_deduplicated_th Dataset
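
The 124M figure is consistent with the standard GPT-2 small architecture. The sketch below counts parameters under the assumption that this checkpoint uses the default GPT-2 hyperparameters (a vocabulary of 50,257 tokens, 1,024 positions, 768-dimensional embeddings, 12 layers) with the output layer tied to the token embeddings. These values are not stated in this card and should be checked against the checkpoint's configuration; `gpt2_param_count` is a hypothetical helper written for illustration.

```python
def gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings.

    The default arguments are the standard GPT-2 small values, assumed here
    rather than read from this checkpoint.
    """
    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * n_embd + n_positions * n_embd

    # Each layer norm has a scale and a bias vector.
    layer_norm = 2 * n_embd

    # Fused query/key/value projection, then the attention output projection.
    attention = n_embd * 3 * n_embd + 3 * n_embd
    attention += n_embd * n_embd + n_embd

    # Feed-forward block: expand to 4 * n_embd, then project back.
    mlp = n_embd * 4 * n_embd + 4 * n_embd
    mlp += 4 * n_embd * n_embd + n_embd

    # Two layer norms per transformer block.
    per_layer = 2 * layer_norm + attention + mlp

    final_layer_norm = 2 * n_embd
    return embeddings + n_layer * per_layer + final_layer_norm


total = gpt2_param_count()
print(f"{total:,} parameters (about {total / 1e6:.0f}M)")
```

If the tokenizer for this model uses a different vocabulary size, only the token embedding term changes, so the total would shift by 768 parameters per vocabulary entry.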

Evaluation Results

The model was trained for 3 epochs. The table below shows the final results at the end of training.

train loss	valid loss	valid PPL	total time (h:mm:ss)
1.638	1.708	5.516	6:12:34
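
The reported perplexity can be cross-checked against the validation loss. The sketch below assumes the usual convention for causal language models, which this card does not state explicitly: perplexity is the exponential of the mean cross-entropy loss measured in nats. It uses only the figures from the table above.

```python
import math

# Figures copied from the results table above.
valid_loss = 1.708
reported_ppl = 5.516

# Assumption: perplexity = exp(mean cross-entropy loss in nats).
computed_ppl = math.exp(valid_loss)

# The published loss is rounded to three decimals, so allow a small tolerance.
print(f"computed perplexity: {computed_ppl:.3f}")
print("close to reported value:", math.isclose(computed_ppl, reported_ppl, abs_tol=0.01))
```

Any small gap between the computed and reported values is consistent with the loss having been rounded before it was written into the table.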

How to Use

As Causal Language Model

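from transformers import pipeline

pretrained_name = "flax-community/gpt2-base-thai"

# Build a text-generation pipeline; the model and tokenizer come from the same checkpoint.
nlp = pipeline(
    "text-generation",
    model=pretrained_name,
    tokenizer=pretrained_name,
)

# The pipeline returns a list of dicts, each with a "generated_text" key.
print(nlp("สวัสดีตอนเช้า"))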

Feature Extraction in PyTorch

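from transformers import GPT2Model, GPT2TokenizerFast

pretrained_name = "flax-community/gpt2-base-thai"
model = GPT2Model.from_pretrained(pretrained_name)
tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_name)

# Tokenize the prompt and return PyTorch tensors.
prompt = "สวัสดีตอนเช้า"
encoded_input = tokenizer(prompt, return_tensors="pt")

# Forward pass; per-token features have shape
# (batch_size, sequence_length, hidden_size).
output = model(**encoded_input)
features = output.last_hidden_state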

Team Members

Sakares Saengkaew (@sakares)
Wilson Wongso (@w11wo)

Author:

Flax Community

Dataset size:

973.31 MB