BriefGPT.xyz
Nov, 2024
LBPE: Long-token-first Tokenization to Improve Large Language Models
Haoran Lian, Yizhe Xiong, Zijia Lin, Jianwei Niu, Shasha Mo...
TL;DR
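This work addresses the learning imbalance in large language models that arises because long tokens occur too infrequently. The proposed LBPE method prioritizes long tokens during encoding, which narrows the frequency gap between short and long tokens. Experiments show that LBPE outperforms conventional Byte Pair Encoding (BPE) across a variety of language modeling tasks.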
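To make "prioritizing long tokens during encoding" concrete, here is a minimal sketch, not the authors' implementation. The summary does not spell out the algorithm, so the sketch assumes one plausible reading: vocabulary tokens are tried in order of decreasing length, and any text left uncovered falls back to single characters. The function name `encode_long_first` and the toy vocabulary are hypothetical.

```python
def encode_long_first(word, vocab):
    """Segment `word` by placing the longest vocabulary tokens first.

    Illustrative assumption only: tokens are ranked by length (longest
    first, ties broken alphabetically), not by BPE merge rank.
    """
    # Each segment is (text, is_final). Non-final spans stay open so that
    # shorter tokens can still be matched inside them later.
    segments = [(word, False)]
    for token in sorted(vocab, key=lambda t: (-len(t), t)):
        next_segments = []
        for text, is_final in segments:
            if is_final or token not in text:
                next_segments.append((text, is_final))
                continue
            pieces = text.split(token)
            for i, piece in enumerate(pieces):
                if piece:
                    next_segments.append((piece, False))
                if i < len(pieces) - 1:
                    next_segments.append((token, True))
        segments = next_segments

    # Anything no token covered falls back to single characters.
    output = []
    for text, is_final in segments:
        output.extend([text] if is_final else list(text))
    return output


toy_vocab = {"token", "ization", "iz", "at", "ion", "to", "ken"}
print(encode_long_first("tokenization", toy_vocab))
```

With this toy vocabulary the word is covered by the two longest matching tokens, so the shorter tokens that are substrings of them are never emitted. Standard BPE encoding, by contrast, applies merges in the order they were learned, so which tokens appear depends on merge rank rather than length.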
Abstract
The prevalent use of Byte Pair Encoding (BPE) in Large Language Models (LLMs) facilitates robust handling of subword units and avoids issues of out-of-vocabulary words. Despite its success, a critical challenge p