BriefGPT.xyz
Aug, 2017
Sequence-to-Label Script Identification for Multilingual OCR
Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst, Ashok C. Popat
TL;DR
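This work proposes a novel line-level script identification method. It recasts line-level script identification as a sequence-to-label problem and solves it with an encoder and a summarizer trained end to end. Tested on scanned books and photographs covering 232 languages in 30 scripts, it improves script identification by 16% over the previous approach and cuts the character error rate attributable to script misidentification by 33%.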
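To make the sequence-to-label idea concrete, here is a minimal sketch of an encoder output being collapsed into one label per line. It is not the authors' implementation: the mean-pooling summarizer, the hand-picked `weights`, the toy `features`, and the function names are all assumptions chosen for illustration, and no training is shown.

```python
import math

def summarize_mean(features):
    """Average a sequence of feature vectors (T x D) into one vector (D).

    Mean pooling is an assumed summarizer, used here only as an example.
    """
    steps = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / steps for d in range(dim)]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_line(features, weights, labels):
    """Map a variable-length feature sequence to a single script label."""
    pooled = summarize_mean(features)
    # One linear score per label (weights has one row per label).
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in weights]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs

# Toy stand-in for encoder output: 3 time steps, 2 feature dimensions.
features = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # hand-picked, not learned
labels = ["Latn", "Cyrl"]           # illustrative script codes

label, probs = classify_line(features, weights, labels)
print(label, probs)
```

The point of the sketch is the shape of the computation: however many time steps the line produces, the summarizer reduces them to a single prediction for the whole line.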
Abstract
We describe a novel line-level script identification method. Previous work repurposed an OCR model generating per-character script codes, counted to obtain line-level script identification. This has two shortcomings ...
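The counting approach attributed to previous work can be sketched as a simple majority vote over per-character script codes. This is only an illustration under assumptions: the function name is hypothetical, the script codes are example values, and the tie-breaking and empty-line behavior are choices made here, not details taken from the paper.

```python
from collections import Counter

def line_script_by_counting(char_scripts):
    """Return the most frequent per-character script code for a line.

    Returns None for an empty line. Ties go to the code seen first,
    which is an arbitrary choice made for this sketch.
    """
    votes = Counter(char_scripts)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical per-character predictions for one text line.
per_char = ["Latn", "Latn", "Cyrl", "Latn", "Cyrl"]
print(line_script_by_counting(per_char))
```

A heuristic like this needs a script prediction for every character before it can label the line, which is the extra work a sequence-to-label formulation avoids.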