Model: stabilityai/stablelm-tuned-alpha-3b
Task: Text Generation
Datasets: dmayhem93/ChatCombined, tatsu-lab/alpaca, nomic-ai/gpt4all_prompt_generations, Dahoas/full-hh-rlhf, jeffwan/sharegpt_vicuna, HuggingFaceH4/databricks_dolly_15k
Language: en
License: cc-by-nc-sa-4.0

StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.
Get started chatting with StableLM-Tuned-Alpha by using the following code snippet:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList

tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
model = AutoModelForCausalLM.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
model.half().cuda()

class StopOnTokens(StoppingCriteria):
    """Stop generation when the model emits one of the chat-format stop tokens."""
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = [50278, 50279, 50277, 1, 0]
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:
                return True
        return False

system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
tokens = model.generate(
    **inputs,
    max_new_tokens=64,
    temperature=0.7,
    do_sample=True,
    stopping_criteria=StoppingCriteriaList([StopOnTokens()]),
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
StableLM Tuned should be used with prompts formatted as `<|SYSTEM|>...<|USER|>...<|ASSISTANT|>...`. The system prompt is:
```
<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
```
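In practice, a multi-turn exchange just concatenates alternating `<|USER|>` and `<|ASSISTANT|>` segments after the system prompt. The helper below is a minimal sketch assuming the format shown above; `build_prompt` and its message structure are our own illustration, not part of the model card:

```python
# Hypothetical helper (not from the model card): assemble a StableLM-Tuned
# prompt from a list of (role, text) turns using the documented format tokens.
def build_prompt(system_prompt: str, messages: list[tuple[str, str]]) -> str:
    prompt = f"<|SYSTEM|>{system_prompt}"
    for role, text in messages:
        token = "<|USER|>" if role == "user" else "<|ASSISTANT|>"
        prompt += f"{token}{text}"
    # End with an open assistant turn so generation continues as the assistant.
    return prompt + "<|ASSISTANT|>"

prompt = build_prompt(
    "# StableLM Tuned (Alpha version)\n"
    "- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.\n",
    [
        ("user", "What's your mood today?"),
        ("assistant", "I'm feeling great, thanks for asking!"),
        ("user", "Write me a haiku about that."),
    ],
)
```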
| Parameters | Hidden Size | Layers | Heads | Sequence Length |
|------------|-------------|--------|-------|-----------------|
| 3B         | 4096        | 16     | 32    | 4096            |
| 7B         | 6144        | 16     | 48    | 4096            |
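As a quick sanity check, these numbers can be read off the published configuration without downloading any weights. A minimal sketch, assuming the checkpoints expose the standard GPT-NeoX config fields:

```python
from transformers import AutoConfig

# Fetch only the model configuration for the 3B checkpoint.
config = AutoConfig.from_pretrained("stabilityai/stablelm-tuned-alpha-3b")

print(config.hidden_size)              # expected: 4096
print(config.num_hidden_layers)        # expected: 16
print(config.num_attention_heads)      # expected: 32
print(config.max_position_embeddings)  # expected: 4096 (sequence length)
```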
The StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets: Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine; GPT4All Prompt Generations, which contains 400k prompts and responses generated by GPT-4; Anthropic HH, made up of preferences about AI assistant helpfulness and harmlessness; DataBricks Dolly, comprising 15k instruction/response pairs generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization; and ShareGPT Vicuna (English subset), a dataset of conversations retrieved from ShareGPT.
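Any of these mixtures can be inspected locally via the Hub datasets listed in the header. A minimal sketch using the `datasets` library, with Alpaca as the example (split and column names are as published by `tatsu-lab/alpaca`; the other datasets differ):

```python
from datasets import load_dataset

# Load the Alpaca instruction data, one of the five fine-tuning mixtures.
alpaca = load_dataset("tatsu-lab/alpaca", split="train")

print(len(alpaca))       # roughly 52k instruction/demonstration rows
print(alpaca[0].keys())  # instruction / input / output / text columns
```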
The models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (FP16) and optimized with AdamW. We outline the following hyperparameters:
| Parameters | Batch Size | Learning Rate | Warm-up | Weight Decay | Betas |
|------------|------------|---------------|---------|--------------|-------------|
| 3B         | 256        | 2e-5          | 50      | 0.01         | (0.9, 0.99) |
| 7B         | 128        | 2e-5          | 100     | 0.01         | (0.9, 0.99) |
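Read as a PyTorch training setup, the 3B row maps onto AdamW roughly as sketched below. Two assumptions not stated in the card: the warm-up column counts optimizer steps, and a linear-warmup schedule is used; the total step count is a placeholder:

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # stand-in for the 3B checkpoint being fine-tuned

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,            # Learning Rate column
    betas=(0.9, 0.99),  # Betas column
    weight_decay=0.01,  # Weight Decay column
)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=50,        # Warm-up column (3B row)
    num_training_steps=10_000,  # placeholder; not specified in the card
)
```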
These models are intended to be used by chat-like applications in the open-source community, in compliance with the CC BY-NC-SA-4.0 license.
Although the aforementioned datasets help to steer the base language model toward "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly.
We thank Dakota Mahan (@dmayhem93) for his help.
```bibtex
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

@misc{vicuna2023,
  title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality},
  url = {https://vicuna.lmsys.org},
  author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.},
  month = {March},
  year = {2023},
}

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
```