LayoutLM

Multimodal (text + layout/format + image) pre-training for Document AI

Microsoft Document AI | GitHub

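Model description

LayoutLM is a simple but effective method for jointly pre-training text and layout, aimed at document image understanding and information extraction tasks such as form understanding and receipt understanding. LayoutLM achieved state-of-the-art results on multiple datasets. For more details, please refer to our paper: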

LayoutLM: Pre-training of Text and Layout for Document Image Understanding. Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou. KDD 2020.
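Because the model takes layout as input alongside text, each word is usually paired with a bounding box. In the Hugging Face Transformers implementation, those boxes are expected on a 0-1000 scale relative to the page size. The following is a minimal sketch of that normalization; the helper name `normalize_bbox`, the sample words, and the page dimensions are illustrative assumptions, not part of this model card.

```python
from typing import List, Sequence


def normalize_bbox(bbox: Sequence[float], page_width: float, page_height: float) -> List[int]:
    """Scale an (x0, y0, x1, y1) pixel box to integer coordinates in [0, 1000]."""
    x0, y0, x1, y1 = bbox
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical OCR output for a 1240 x 1754 pixel page (assumed values).
words = ["Invoice", "Total"]
pixel_boxes = [(124, 175, 372, 210), (620, 877, 744, 912)]

normalized_boxes = [normalize_bbox(box, 1240, 1754) for box in pixel_boxes]
print(normalized_boxes)
```

The words and pixel boxes would normally come from an OCR engine; any engine that reports word-level coordinates and the page size should work with this kind of helper.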

Different Tokenizer

Note that LayoutLM-Cased requires a different tokenizer, based on RobertaTokenizer. You can initialize it as follows:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('microsoft/layoutlm-base-cased')
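Since a subword tokenizer may split one word into several tokens, a common preprocessing step is to repeat each word's bounding box for every token it produces. The sketch below shows one way to do this; `expand_boxes_to_tokens` and `toy_tokenize` are hypothetical names, and `toy_tokenize` is only a stand-in so the example runs without downloading anything.

```python
from typing import Callable, List, Sequence, Tuple


def expand_boxes_to_tokens(
    words: Sequence[str],
    boxes: Sequence[Sequence[int]],
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Tokenize word by word and repeat each word's box once per token."""
    tokens: List[str] = []
    token_boxes: List[List[int]] = []
    for word, box in zip(words, boxes):
        pieces = tokenize(word)
        tokens.extend(pieces)
        # One copy of the word-level box for each subword piece.
        token_boxes.extend(list(box) for _ in pieces)
    return tokens, token_boxes


def toy_tokenize(word: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits a word into 4-character chunks."""
    return [word[i:i + 4] for i in range(0, len(word), 4)]


tokens, token_boxes = expand_boxes_to_tokens(
    ["Invoice", "Total"],
    [[100, 99, 300, 119], [500, 500, 600, 519]],
    toy_tokenize,
)
print(tokens)
print(token_boxes)
```

With the tokenizer initialized above, `tokenizer.tokenize` could be passed in place of `toy_tokenize`. Byte-level BPE tokenizers can treat leading spaces specially, so it is worth checking that word-by-word tokenization gives the tokens you expect before relying on this.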

Citation

If you find LayoutLM useful in your research, please cite the following paper:

@misc{xu2019layoutlm,
    title={LayoutLM: Pre-training of Text and Layout for Document Image Understanding},
    author={Yiheng Xu and Minghao Li and Lei Cui and Shaohan Huang and Furu Wei and Ming Zhou},
    year={2019},
    eprint={1912.13318},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Author:

Microsoft

Dataset size:

492.52 MB