May 2021
StructuralLM: Structural Pre-training for Form Understanding
Chenliang Li, Bin Bi, Ming Yan, Wei Wang, Songfang Huang...
TL;DR
This paper proposes StructuralLM, a new pre-training approach that jointly leverages cell and layout information from scanned documents. Pre-trained with two new designs, including a cell-position classification objective, it achieves new state-of-the-art results when fine-tuned on downstream NLP tasks, effectively improving performance on form understanding, document visual question answering, and document image classification.
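As a rough illustration of how cell-level layout information and a cell-position classification objective could fit together, here is a minimal PyTorch-style sketch. Everything below is an assumption for illustration, not the paper's implementation: the class names (`StructuralEmbeddings`, `CellPositionHead`), embedding sizes, grid resolution, and number of page areas are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuralEmbeddings(nn.Module):
    """Token embeddings plus a shared 2D cell-position embedding.

    All tokens inside one cell receive the same layout embedding, so the
    cell acts as a single semantic unit. Sizes here are placeholders.
    """
    def __init__(self, vocab_size: int = 30522, hidden: int = 768, grid: int = 1000):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, hidden)
        self.x_emb = nn.Embedding(grid, hidden)  # discretized cell x-coordinate
        self.y_emb = nn.Embedding(grid, hidden)  # discretized cell y-coordinate

    def forward(self, token_ids, cell_x, cell_y):
        # cell_x / cell_y: per-token coordinates of the enclosing cell;
        # tokens that share a cell share the same coordinate values.
        return self.tok_emb(token_ids) + self.x_emb(cell_x) + self.y_emb(cell_y)


class CellPositionHead(nn.Module):
    """Cell-position classification: predict which of `areas` page regions
    a cell falls into, from the encoder state of that cell's tokens."""
    def __init__(self, hidden: int = 768, areas: int = 16):
        super().__init__()
        self.classifier = nn.Linear(hidden, areas)

    def forward(self, cell_states, area_labels):
        logits = self.classifier(cell_states)        # (num_cells, areas)
        return F.cross_entropy(logits, area_labels)  # pre-training loss
```

One plausible training loop would mask the layout coordinates of some cells and ask the head to recover the page area they belong to, forcing the encoder to relate cell content to spatial position.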
Abstract
Large pre-trained language models achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, they almost exclusively focus on text-only representation, while neglecting cell-level layout information…