BriefGPT.xyz
Jul, 2023
Lookbehind Optimizer: k steps back, 1 step forward
Gonçalo Mordido, Pranshu Malviya, Aristide Baratin, Sarath Chandar
TL;DR
By combining the Lookahead optimizer with sharpness-aware minimization, the Lookbehind method trains deep neural networks with a better trade-off between stability and loss sharpness, improving generalization performance, robustness, and resistance to forgetting.
Abstract
The lookahead optimizer improves the training stability of deep neural networks by having a set of fast weights that "look ahead" to guide the descent direction. Here, we combine this idea with sharpness-aware minimization (SAM).
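To make the "fast weights that look ahead" mechanism concrete, here is a minimal sketch of the Lookahead update rule that Lookbehind builds on: run k fast-weight steps, then move the slow weights a fraction alpha toward where the fast weights ended up. The toy quadratic loss, the learning rate, and the values of k and alpha below are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * ||w||^2 (illustrative choice)
    return w

def lookahead_sgd(w0, k=5, alpha=0.5, lr=0.1, outer_steps=20):
    slow = np.asarray(w0, dtype=float)
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):
            fast -= lr * grad(fast)        # k fast "look ahead" steps
        slow += alpha * (fast - slow)      # interpolate slow weights toward fast weights
    return slow

w = lookahead_sgd(np.array([3.0, -2.0]))
```

On this convex toy problem the slow weights converge toward the minimizer at the origin; Lookbehind modifies this scheme by pairing the inner steps with SAM-style sharpness-aware updates.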