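TL;DR: This work proposes a method for correcting spelling errors in short text based on the Hierarchical Character Tagger (HCTagger) model. The model uses a pretrained character-level language model as its text encoder and introduces a hierarchical multi-task decoding approach to mitigate the long-tailed label distribution problem. Experiments show that HCTagger is more accurate and faster than many existing models.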
Abstract
State-of-the-art approaches to the spelling error correction problem include transformer-based seq2seq models, which require large training sets and suffer from slow inference; and →