THUDM/glm-10b-chinese

Model:

THUDM/glm-10b-chinese

Task:

Feature extraction

Libraries:

PyTorch Transformers

Languages:

Other:

glm custom_code thudm

Preprint:

arxiv:2103.10360

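GLM is a General Language Model pretrained with an autoregressive blank-filling objective. It can be fine-tuned on a wide range of natural language understanding and generation tasks.

For a detailed description of GLM, please refer to our paper:

GLM: General Language Model Pretraining with Autoregressive Blank Infilling (ACL 2022)

Zhengxiao Du*, Yujie Qian*, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang (*: equal contribution)

More examples can be found in our GitHub repo.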

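Model description

glm-10b-chinese is pretrained on the WuDaoCorpora dataset. It has 48 Transformer layers, each with a hidden size of 4096 and 64 attention heads. The model is pretrained with autoregressive blank-filling objectives designed for natural language understanding, sequence-to-sequence tasks, and language modeling.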
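As a rough sanity check on the "10b" in the model name, the figures above can be plugged into the usual back-of-the-envelope estimate for a Transformer stack. This is only a sketch: it assumes a standard block with a 4x feed-forward expansion and ignores embeddings, biases, and layer norms, none of which are stated in the description above.

```python
# Architecture figures quoted in the model description.
NUM_LAYERS = 48
HIDDEN_SIZE = 4096
NUM_HEADS = 64

# Per-head dimension: the hidden size is split evenly across the heads.
head_dim = HIDDEN_SIZE // NUM_HEADS

# Assumption: standard Transformer block with a 4x feed-forward expansion.
#   attention (Q, K, V, output projections): 4 * h^2
#   feed-forward (h -> 4h -> h):             8 * h^2
params_per_layer = 12 * HIDDEN_SIZE ** 2
approx_params = NUM_LAYERS * params_per_layer

# Half precision stores each parameter in two bytes.
approx_fp16_gib = approx_params * 2 / 2 ** 30

print(f"head_dim = {head_dim}")
print(f"approx. block parameters = {approx_params / 1e9:.2f}B")
print(f"approx. fp16 weight memory = {approx_fp16_gib:.1f} GiB")
```

Under these assumptions the blocks alone account for roughly 9.66 billion parameters, which is consistent with the nominal 10B size.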

How to use

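from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# trust_remote_code=True is needed because GLM ships its own model and tokenizer code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b-chinese", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("THUDM/glm-10b-chinese", trust_remote_code=True)
model = model.half().cuda()  # cast the weights to fp16 and move them to the GPU

# The prompt contains one [MASK] span for the model to fill in.
inputs = tokenizer("凯旋门位于意大利米兰市古城堡旁。1807年为纪念[MASK]而建，门高25米，顶上矗立两武士青铜古兵车铸像。", return_tensors="pt")
# Extend the encoded prompt with the positions used for generation.
inputs = tokenizer.build_inputs_for_generation(inputs, max_gen_length=512)
inputs = {key: value.cuda() for key, value in inputs.items()}
# Stop as soon as the end-of-piece token is produced.
outputs = model.generate(**inputs, max_length=512, eos_token_id=tokenizer.eop_token_id)
print(tokenizer.decode(outputs[0].tolist()))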
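The decoded string contains the original prompt as well as the generated span, so it is often convenient to pull out just the filled-in text. The helper below is a minimal sketch using plain string handling. The function name `extract_generated` is hypothetical, and the default marker strings `<|startofpiece|>` and `<|endofpiece|>` are assumptions about how the tokenizer renders its start-of-piece and end-of-piece tokens; check the output of `tokenizer.decode` for your checkpoint and adjust them if needed.

```python
def extract_generated(decoded: str,
                      start_marker: str = "<|startofpiece|>",
                      end_marker: str = "<|endofpiece|>") -> str:
    """Return the text between the first start marker and the next end marker.

    The default marker strings are assumptions, not confirmed by the model card.
    """
    start = decoded.find(start_marker)
    if start == -1:
        return ""  # no generated span found
    start += len(start_marker)
    end = decoded.find(end_marker, start)
    if end == -1:
        end = len(decoded)  # generation stopped before the end marker
    return decoded[start:end].strip()


# Hand-written placeholder string for illustration, not real model output.
sample = "prompt with [MASK] inside <|startofpiece|> filled text <|endofpiece|>"
print(extract_generated(sample))
```

Passing the markers as parameters keeps the helper usable if the tokenizer prints different special-token strings.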

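We use three different mask tokens for different tasks: [MASK] for short blank filling, [sMASK] for sentence filling, and [gMASK] for left-to-right generation. You can find examples of the different masks here.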
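One way to keep the three mask tokens straight when building prompts is a small lookup table. The sketch below is illustrative only: the task labels (`"short"`, `"sentence"`, `"generation"`) and the helper names `mask_for` and `fill_prompt` are hypothetical, and placing [gMASK] at the end of the prefix for left-to-right generation is an assumption about typical usage rather than something stated above.

```python
# Mask tokens named in the text, keyed by hypothetical task labels.
MASK_TOKENS = {
    "short": "[MASK]",        # short blank filling
    "sentence": "[sMASK]",    # sentence filling
    "generation": "[gMASK]",  # left-to-right generation
}


def mask_for(task: str) -> str:
    """Look up the mask token for a task label, rejecting unknown labels."""
    if task not in MASK_TOKENS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(MASK_TOKENS)}")
    return MASK_TOKENS[task]


def fill_prompt(before: str, task: str, after: str = "") -> str:
    """Place the task's mask token between the text before and after the blank."""
    return f"{before}{mask_for(task)}{after}"


# Short blank in the middle of a sentence.
print(fill_prompt("1807年为纪念", "short", "而建。"))
# Left-to-right generation: the mask sits at the end of the prefix (assumed usage).
print(fill_prompt("凯旋门位于意大利米兰市古城堡旁。", "generation"))
```

The resulting string can then be passed to the tokenizer in place of a hand-written prompt.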

Citation

Please cite our paper if you find this code useful for your research:

@inproceedings{DBLP:conf/acl/DuQLDQY022,
  author    = {Zhengxiao Du and
               Yujie Qian and
               Xiao Liu and
               Ming Ding and
               Jiezhong Qiu and
               Zhilin Yang and
               Jie Tang},
  title     = {{GLM:} General Language Model Pretraining with Autoregressive Blank Infilling},
  booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational
               Linguistics (Volume 1: Long Papers), {ACL} 2022, Dublin, Ireland,
               May 22-27, 2022},
  pages     = {320--335},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
}

Authors:

Data Mining Research Group at Tsinghua University

Dataset size:

18.4 GB