A Generalizable Approach to Learning Optimizers

Jun, 2021

Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba

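TL;DR: Designed from a generalization-first perspective, this system learns to update optimizer hyperparameters using novel features, actions, and a reward function, with the goal of optimizing the generalization performance of neural networks. It outperforms Adam on all neural network tasks, achieving a 2x speedup on ImageNet and a 2.5x speedup on a language modeling task that uses 5 orders of magnitude more compute than the training tasks.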

Abstract

A core issue with learning to optimize neural networks has been the lack of generalization to real world problems. To address this, we describe a system designed from a →