Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang...
TL;DR: Parallel computation in large-scale language models reduces inference latency and improves performance.
Abstract
Fast-growing large-scale language models are delivering unprecedented
performance on almost all natural language processing tasks. However, the
effectiveness of large language models is reliant on an exponentially
increasing number of parameters. The overwhelming computation compl