BriefGPT.xyz
Jul, 2023
Lookbehind Optimizer: k steps back, 1 step forward
Gonçalo Mordido, Pranshu Malviya, Aristide Baratin, Sarath Chandar
TL;DR
By combining the Lookahead optimizer with sharpness-aware minimization, the Lookbehind method trains deep neural networks with a better trade-off between stability and loss sharpness, improving generalization performance, robustness, and resistance to forgetting.
Abstract
The lookahead optimizer improves the training stability of deep neural networks by having a set of fast weights that "look ahead" to guide the descent direction. Here, we combine this idea with sharpness-aware minimization (SAM).
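To make the "fast weights that look ahead" mechanism concrete, here is a minimal sketch of the Lookahead update rule that Lookbehind builds on: run k fast-weight steps, then move the slow weights a fraction alpha toward where the fast weights ended up. The toy quadratic loss, the learning rate, and the values of k and alpha below are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * ||w||^2 (illustrative choice)
    return w

def lookahead_sgd(w0, k=5, alpha=0.5, lr=0.1, outer_steps=20):
    slow = np.asarray(w0, dtype=float)
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):
            fast -= lr * grad(fast)        # k fast "look ahead" steps
        slow += alpha * (fast - slow)      # interpolate slow weights toward fast weights
    return slow

w = lookahead_sgd(np.array([3.0, -2.0]))
```

On this convex toy problem the slow weights converge toward the minimizer at the origin; Lookbehind modifies this scheme by pairing the inner steps with SAM-style sharpness-aware updates.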