BriefGPT.xyz
Oct, 2020
DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding
Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
TL;DR
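This study proposes a multimodal framework for form understanding that effectively extracts key-value pairs from forms. Experiments on benchmark datasets, including medical forms and FUNSD, demonstrate the method's effectiveness.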
Abstract
Form understanding depends on both textual contents and organizational structure. Although modern OCR performs well, it is still challenging to realize general …