Oct, 2021
Kronecker Decomposition for GPT Compression
Ali Edalati, Marzieh Tahaei, Ahmad Rashid, Vahid Partovi Nia, James J. Clark...
TL;DR
This work compresses the linear mappings of the GPT-2 model using Kronecker decomposition and uses this technique to train a new neural language model, KnGPT2. After efficient pre-training, the model outperforms the existing DistilGPT2 model with the same number of parameters, achieving strong results on both language modeling and the General Language Understanding Evaluation (GLUE) benchmark tasks.
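As a rough illustration of the idea described in the TL;DR, the sketch below replaces a dense linear layer's weight matrix with a Kronecker product of two much smaller factor matrices, which is the basic mechanism behind this kind of compression. The class name, factor shapes, and initialization here are assumptions chosen for illustration; they are not the paper's KnGPT2 implementation.

```python
import torch
import torch.nn as nn

class KroneckerLinear(nn.Module):
    """Linear layer whose weight is factorized as W ≈ A ⊗ B (illustrative sketch)."""

    def __init__(self, in_features, out_features, a_shape, b_shape, bias=True):
        super().__init__()
        a_out, a_in = a_shape
        b_out, b_in = b_shape
        # The Kronecker product of (a_out x a_in) and (b_out x b_in) factors
        # has shape (a_out*b_out, a_in*b_in), which must match the full weight.
        assert a_out * b_out == out_features and a_in * b_in == in_features
        self.A = nn.Parameter(torch.randn(a_out, a_in) * 0.02)
        self.B = nn.Parameter(torch.randn(b_out, b_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        # Materialize W = A ⊗ B; parameters stored are only the two factors
        # (a_out*a_in + b_out*b_in values) instead of out_features*in_features.
        W = torch.kron(self.A, self.B)
        return nn.functional.linear(x, W, self.bias)

# Hypothetical example: a 3072 -> 768 projection (GPT-2-like MLP output size)
# factorized into 32x64 and 24x48 factors (3,200 parameters vs. ~2.4M dense).
layer = KroneckerLinear(in_features=3072, out_features=768,
                        a_shape=(32, 64), b_shape=(24, 48))
y = layer(torch.randn(4, 3072))  # -> shape (4, 768)
```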
Abstract
GPT is an auto-regressive Transformer-based pre-trained language model which has attracted a lot of attention in the natural language processing (NLP) domain due to its state-of-the-art performance in several downstream tasks.