Nov, 2018
Measuring the Effects of Data Parallelism on Neural Network Training
Christopher J. Shallue, Jaehoon Lee, Joe Antognini, Jascha Sohl-Dickstein, Roy Frostig...
TL;DR
This paper studies how increasing the batch size affects neural network training time and model performance. It finds large differences across workloads, and it does not find that increasing the batch size degrades model performance.
Abstract
Recent hardware developments have made unprecedented amounts of data parallelism available for accelerating neural network training. Among the simplest ways to harness next-generation accelerators is to increase the batch size.
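The "increase the batch size" knob described above can be illustrated with a minimal mini-batch SGD sketch. This is not the paper's experimental setup; it is a toy least-squares problem with made-up data, intended only to show where batch size enters the training loop as the data-parallelism parameter:

```python
import numpy as np

def minibatch_sgd(X, y, batch_size, lr=0.1, steps=200, seed=0):
    """Plain mini-batch SGD on the least-squares loss ||Xw - y||^2.

    batch_size is the data-parallelism knob: larger batches let
    accelerators process more examples per optimization step.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Sample one mini-batch and take a gradient step on it.
        idx = rng.choice(len(X), size=batch_size, replace=False)
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / batch_size
        w -= lr * grad
    return w

# Toy data: y = X @ [1.0, -2.0] plus small noise (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(512, 2))
y = X @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=512)

# Same workload, three batch sizes: the knob the paper sweeps.
for bs in (8, 64, 512):
    w = minibatch_sgd(X, y, batch_size=bs)
    print(bs, np.round(w, 2))
```

In this toy all three batch sizes recover roughly the same weights; the paper's contribution is measuring, across real workloads, how many steps each batch size needs and where the speedup from larger batches saturates.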