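TL;DR: This paper demonstrates the advantages of tokenizing phoneme units and DAUs on three prediction tasks: grapheme-to-phoneme, grapheme-to-DAU, and unsupervised speech generation using DAU language modeling. It shows that tokenization brings significant improvements in performance and in training and inference speed, and it provides a theoretical explanation for these gains.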
Abstract
Tokenization algorithms that merge the units of a base vocabulary into
larger, variable-rate units have become standard in natural language processing
tasks. This idea, however, has been mostly overlooked when the vocabulary
consists of phonemes or Discrete Acoustic Units (DAUs), an au