Model:
RUCAIBox/mvp
The MVP model was proposed in MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao, and Ji-Rong Wen.
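Detailed information and instructions can be found at https://github.com/RUCAIBox/MVP.
MVP is pre-trained in a supervised manner on a mixture of labeled datasets, and it follows the standard Transformer encoder-decoder architecture.
MVP is designed specifically for natural language generation and can be adapted to a wide range of generation tasks, including but not limited to summarization, data-to-text generation, open-ended dialogue systems, story generation, question answering, question generation, task-oriented dialogue systems, commonsense generation, paraphrase generation, text style transfer, and text simplification. Our model can also be adapted to natural language understanding tasks such as sequence classification and (extractive) question answering.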
For summarization:
>>> from transformers import MvpTokenizer, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizer.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
... "Summarize: You may want to stick it to your boss and leave your job, but don't do it if these are your reasons.",
... return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
["Why You Shouldn't Quit Your Job"]
For data-to-text generation:
>>> from transformers import MvpTokenizerFast, MvpForConditionalGeneration
>>> tokenizer = MvpTokenizerFast.from_pretrained("RUCAIBox/mvp")
>>> model = MvpForConditionalGeneration.from_pretrained("RUCAIBox/mvp")
>>> inputs = tokenizer(
... "Describe the following data: Iron Man | instance of | Superhero [SEP] Stan Lee | creator | Iron Man",
... return_tensors="pt",
... )
>>> generated_ids = model.generate(**inputs)
>>> tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
['Stan Lee created the character of Iron Man, a fictional superhero appearing in American comic']
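Both examples above steer the model with a plain-text task prefix ("Summarize: " and "Describe the following data: "), and the data-to-text input joins triples with " | " and " [SEP] ". The following is a minimal sketch of helpers that build such input strings. The function names are hypothetical and not part of transformers, and the format is inferred only from the two examples shown here; other tasks may expect different prefixes.

```python
from typing import Iterable, Tuple

# A (subject, relation, object) triple, as in the data-to-text example above.
Triple = Tuple[str, str, str]


def build_summarization_prompt(document: str) -> str:
    """Prepend the 'Summarize: ' prefix used in the summarization example."""
    return f"Summarize: {document.strip()}"


def build_data_to_text_prompt(triples: Iterable[Triple]) -> str:
    """Linearize triples as 'a | b | c [SEP] d | e | f' after the task prefix.

    The separators mirror the single example in this document; this is an
    assumption about the expected format, not a documented specification.
    """
    linearized = " [SEP] ".join(
        " | ".join(part.strip() for part in triple) for triple in triples
    )
    return f"Describe the following data: {linearized}"


prompt = build_data_to_text_prompt(
    [
        ("Iron Man", "instance of", "Superhero"),
        ("Stan Lee", "creator", "Iron Man"),
    ]
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hand-written input in the examples above.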
MVP: https://huggingface.co/RUCAIBox/mvp.
Prompt-based models:
Multi-task models:
@article{tang2022mvp,
title={MVP: Multi-task Supervised Pre-training for Natural Language Generation},
author={Tang, Tianyi and Li, Junyi and Zhao, Wayne Xin and Wen, Ji-Rong},
journal={arXiv preprint arXiv:2206.12131},
year={2022},
url={https://arxiv.org/abs/2206.12131},
}