Model: bigscience/bloomz-3b
We present the BLOOMZ & mT0 model family, a set of models capable of following human instructions in dozens of languages zero-shot. We finetune the pretrained multilingual BLOOM & mT5 models on our crosslingual task mixture (xP3) and find that the resulting models are capable of crosslingual generalization to unseen tasks and languages.
| Multitask finetuned on 1239321. Recommended for prompting in English. | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Parameters | 300M | 580M | 1.2B | 3.7B | 13B | 560M | 1.1B | 1.7B | 3B | 7.1B | 176B |
| Finetuned Model | 12310321 | 12311321 | 12312321 | 12313321 | 12314321 | 12315321 | 12316321 | 12317321 | 12318321 | 12319321 | 12320321 |
| Multitask finetuned on 12321321. Recommended for prompting in non-English. | | | | | | | | | | | |
| Finetuned Model | 12322321 | 12323321 | 12324321 | | | | | | | | |
| Multitask finetuned on 12325321. Released for research purposes only. Strictly inferior to above models! | | | | | | | | | | | |
| Finetuned Model | 12326321 | 12327321 | 12328321 | | | | | | | | |
| Original pretrained checkpoints. Not recommended. | | | | | | | | | | | |
| Pretrained Model | 12329321 | 12330321 | 12331321 | 12332321 | 12333321 | 12334321 | 12335321 | 12336321 | 12337321 | 12338321 | 12339321 |
We recommend using the model to perform tasks expressed in natural language. For example, given the prompt "Translate to English: Je t'aime.", the model will most likely answer "I love you.". Our paper provides some prompt ideas:

Feel free to share your generations in the Community tab!
```python
# CPU
# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-3b"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
```python
# GPU
# pip install -q transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-3b"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# device_map="auto" places the weights on the available GPU(s); torch_dtype="auto" keeps the checkpoint's dtype
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype="auto", device_map="auto")

inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
```python
# GPU in 8-bit
# pip install -q transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-3b"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# load_in_8bit=True quantizes the weights to 8-bit via bitsandbytes to reduce GPU memory usage
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True)

inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
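The three snippets differ only in how the checkpoint is loaded: the first runs on CPU, the second uses `device_map="auto"` (via accelerate) to place the weights on available GPUs, and the third additionally loads the weights in 8-bit through bitsandbytes to reduce GPU memory usage, possibly at a small cost in output quality.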
Prompt Engineering: Performance may vary depending on the prompt. For BLOOMZ models, we recommend making it very clear when the input stops, to avoid the model trying to continue it. For example, the prompt "Translate to English: Je t'aime" without the full stop (.) at the end may result in the model trying to continue the French sentence. Better prompts are e.g. "Translate to English: Je t'aime.", "Translate to English: Je t'aime. Translation:", or "What is 'Je t'aime.' in English?", where it is clear to the model when it should answer. Further, we recommend providing the model with as much context as possible. For example, if you want it to answer in Telugu, then tell the model, e.g. "Explain in a sentence in Telugu what is backpropagation in neural networks."
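As a rough illustration of the point above, the sketch below reuses the CPU setup from the earlier snippets to run the same request with different end-of-input signals so the outputs can be compared; the prompt strings and the `max_new_tokens` value are illustrative choices, not part of the original examples.

```python
# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-3b"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Illustrative prompt variants; only the end-of-input signal differs.
prompts = [
    "Translate to English: Je t'aime",                # no final period: the model may continue the French text
    "Translate to English: Je t'aime.",               # clear end of input
    "Translate to English: Je t'aime. Translation:",  # explicit cue for where the answer should start
]

for prompt in prompts:
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(inputs, max_new_tokens=20)  # max_new_tokens is an illustrative bound
    print(repr(prompt), "->", tokenizer.decode(outputs[0], skip_special_tokens=True))
```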
We refer to Table 7 of our paper & bigscience/evaluation-results for zero-shot results on unseen tasks. The sidebar reports the zero-shot performance of the best prompt per dataset config.
```bibtex
@article{muennighoff2022crosslingual,
  title={Crosslingual generalization through multitask finetuning},
  author={Muennighoff, Niklas and Wang, Thomas and Sutawika, Lintang and Roberts, Adam and Biderman, Stella and Scao, Teven Le and Bari, M Saiful and Shen, Sheng and Yong, Zheng-Xin and Schoelkopf, Hailey and others},
  journal={arXiv preprint arXiv:2211.01786},
  year={2022}
}
```