Apr 2020
Towards Character-Level Transformer NMT by Finetuning Subword Systems
Jindřich Libovický, Alexander Fraser
TL;DR
Character-level Transformer architectures usually require very deep models that are difficult to train. This paper proposes first training an NMT model with subword segmentation and then finetuning it at the character level, yielding a neural machine translation model that needs no segmentation. The authors show that this approach captures morphological phenomena better, is more robust, and comes at a comparatively small additional training cost.
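The core idea is that only the input granularity changes between the two training phases: the model is pretrained on subword units and then continues training on character sequences. The sketch below illustrates the difference in plain Python; the subword_segment and char_segment functions and the toy vocabulary are hypothetical stand-ins for illustration, not the authors' implementation.

# Toy illustration: the same sentence under subword vs. character
# segmentation. A model pretrained on the subword stream is later
# finetuned on the character stream; the architecture is unchanged,
# only the input granularity differs.

def subword_segment(sentence, vocab):
    # Greedy longest-match segmentation against a toy subword vocabulary.
    tokens, i = [], 0
    while i < len(sentence):
        for j in range(len(sentence), i, -1):  # try longest piece first
            piece = sentence[i:j]
            if piece in vocab or j == i + 1:   # fall back to a single char
                tokens.append(piece)
                i = j
                break
    return tokens

def char_segment(sentence):
    # Character-level segmentation used during the finetuning phase.
    return list(sentence)

vocab = {"trans", "form", "er", "char", "acter"}
sentence = "transformer"

print(subword_segment(sentence, vocab))  # ['trans', 'form', 'er']
print(char_segment(sentence))            # ['t', 'r', 'a', 'n', 's', ...]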
Abstract
Applying the transformer architecture on the character level usually requires very deep architectures that are difficult and slow to train. A few approaches have been proposed that partially overcome this problem …