Model:
RUCAIBox/mvp
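The MVP model was proposed in MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
Detailed information and instructions can be found at https://github.com/RUCAIBox/MVP.
MVP is pre-trained in a supervised fashion on a mixture of labeled datasets, and it follows the standard Transformer encoder-decoder architecture.
MVP is designed specifically for natural language generation and can be adapted to a wide range of generation tasks, including but not limited to summarization, data-to-text generation, open-ended dialogue systems, story generation, question answering, question generation, task-oriented dialogue systems, commonsense generation, paraphrase generation, text style transfer, and text simplification. Our model can also be adapted to natural language understanding tasks such as sequence classification and (extractive) question answering.
For summarization: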
>>> from transformers import MvpTokenizer, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
...     "Summarize: You may want to stick it to your boss and leave your job, but don't do it if these are your reasons.",
...     return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
["Why You Shouldn't Quit Your Job"]
For data-to-text generation:
>>> from transformers import MvpTokenizerFast, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizerFast.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
...     "Describe the following data: Iron Man | instance of | Superhero [SEP] Stan Lee | creator | Iron Man",
...     return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
['Stan Lee created the character of Iron Man, a fictional superhero appearing in American comic']
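The data-to-text input is a plain string: a task instruction, then each triple written as `subject | relation | object`, with triples separated by ` [SEP] `. Below is a minimal sketch of building such a string from structured triples. The helper name `linearize_triples` and its default instruction are hypothetical, inferred only from the example input for data-to-text generation; they are not part of the MVP repository or the transformers API.

```python
from typing import Iterable, Tuple

# A (subject, relation, object) triple; this alias is illustrative only.
Triple = Tuple[str, str, str]


def linearize_triples(
    triples: Iterable[Triple],
    instruction: str = "Describe the following data",
) -> str:
    """Join triples into one prompt string in the format of the example input.

    Assumption: fields within a triple are joined with " | " and triples are
    separated by " [SEP] ", as in the data-to-text example.
    """
    parts = [" | ".join(triple) for triple in triples]
    return f"{instruction}: " + " [SEP] ".join(parts)


triples = [
    ("Iron Man", "instance of", "Superhero"),
    ("Stan Lee", "creator", "Iron Man"),
]
prompt = linearize_triples(triples)
print(prompt)
```

The resulting `prompt` can be passed to the tokenizer in place of the hard-coded string. Other tasks would presumably need a different instruction, for example `"Summarize"`.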
MVP: https://huggingface.co/RUCAIBox/mvp.
Prompt-based models:
Multi-task models:
@article{tang2022mvp,
  title={MVP: Multi-task Supervised Pre-training for Natural Language Generation},
  author={Tang, Tianyi and Li, Junyi and Zhao, Wayne Xin and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2206.12131},
  year={2022},
  url={https://arxiv.org/abs/2206.12131},
}