BriefGPT.xyz
Jun, 2021
Stochastic gradient descent with noise of machine learning type. Part II: Continuous time analysis
Stephan Wojtowytsch
TL;DR
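Suitable parameters for artificial neural networks are found by stochastic gradient descent (SGD) and more advanced SGD-based algorithms. In a certain noise regime, the optimization algorithm tends to select "flat" minima of the objective function, and this preference differs from the flat-minimum selection of continuous-time SGD with homogeneous noise.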
Abstract
The representation of functions by artificial neural networks depends on a large number of parameters in a non-linear fashion. Suitable parameters of these are found by minimizing a 'loss functional', typically b…