Model: staka/fugumt-en-ja
Task: Translation
License: cc-by-sa-4.0

This is a translation model using Marian-NMT. For more details, please see my repository.
This model uses transformers and sentencepiece.
!pip install transformers sentencepiece
You can use this model directly with a pipeline:

    from transformers import pipeline

    fugu_translator = pipeline('translation', model='staka/fugumt-en-ja')
    fugu_translator('This is a cat.')

If you want to translate multiple sentences, we recommend using pySBD to split the text into sentences first.

    !pip install transformers sentencepiece pysbd

    import pysbd
    from transformers import pipeline

    seg_en = pysbd.Segmenter(language="en", clean=False)
    fugu_translator = pipeline('translation', model='staka/fugumt-en-ja')
    txt = 'This is a cat. It is very cute.'
    print(fugu_translator(seg_en.segment(txt)))

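The pipeline returns one dictionary per input sentence, with the translated text under the `translation_text` key. If you want a single Japanese string back instead of a list, a small wrapper along the following lines may help. This is only a sketch: the helper names `translate_text` and `build_fugu_translate_text` are hypothetical and not part of this model or of transformers, and joining sentences without a separator is an assumption that suits Japanese output.

```python
from typing import Callable, Dict, List


def translate_text(
    text: str,
    segment: Callable[[str], List[str]],
    translate: Callable[[List[str]], List[Dict[str, str]]],
) -> str:
    """Split `text` into sentences, translate them as one batch, and join the results."""
    # Drop empty segments and surrounding whitespace left over from segmentation.
    sentences = [s.strip() for s in segment(text) if s.strip()]
    if not sentences:
        return ""
    results = translate(sentences)
    # A transformers translation pipeline returns one dict per input,
    # each holding the output under the 'translation_text' key.
    # Assumption: Japanese sentences are joined without a space.
    return "".join(r["translation_text"] for r in results)


def build_fugu_translate_text() -> Callable[[str], str]:
    """Wire the helper to pySBD and the FuguMT pipeline.

    Imports are done lazily; calling this downloads the model on first use.
    """
    import pysbd
    from transformers import pipeline

    seg_en = pysbd.Segmenter(language="en", clean=False)
    fugu_translator = pipeline("translation", model="staka/fugumt-en-ja")
    return lambda text: translate_text(text, seg_en.segment, fugu_translator)
```

Usage would look like `translate = build_fugu_translate_text()` followed by `print(translate('This is a cat. It is very cute.'))`.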
Evaluation results on Tatoeba (500 randomly selected sentences) are as follows:
source | target | BLEU(*1) |
---|---|---|
en | ja | 32.7 |
(*1) sacrebleu --tokenize ja-mecab
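A comparable evaluation could be sketched as below. This is not the script used to produce the score above: the helper names `sample_pairs` and `bleu_ja`, the seed, and the sampling procedure are all assumptions for illustration. Only the `ja-mecab` tokenizer setting comes from the footnote. Scoring requires sacrebleu with MeCab support (`pip install "sacrebleu[ja]"`).

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    pairs: Sequence[Tuple[str, str]], n: int = 500, seed: int = 0
) -> List[Tuple[str, str]]:
    """Pick `n` (source, reference) pairs at random, reproducibly.

    Assumption: the seed and sampling method are illustrative only.
    """
    if len(pairs) <= n:
        return list(pairs)
    return random.Random(seed).sample(list(pairs), n)


def bleu_ja(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Corpus-level BLEU with sacrebleu's MeCab-based Japanese tokenizer."""
    import sacrebleu  # lazy import; needs the optional 'ja' extra for MeCab

    result = sacrebleu.corpus_bleu(
        list(hypotheses), [list(references)], tokenize="ja-mecab"
    )
    return result.score
```

With a list of English/Japanese pairs in hand, you would sample with `sample_pairs`, translate the English side with the pipeline, and pass the translations and the Japanese references to `bleu_ja`. Scores from a different sample will not match the table exactly.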