Characterizing the implicit bias via a primal-dual analysis

Jun, 2019

A refined primal-dual analysis of the implicit bias

Ziwei Ji, Matus Telgarsky

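TL;DR: This paper proves that, for linearly separable data, the implicit bias of gradient descent is fully characterized by the optimal solution of a dual optimization problem, and that this characterization extends to general losses. In addition, with a constant step size, the iterates converge to the L2 maximum-margin direction at a rate of O(ln(n)/ln(t)), where n is the dataset size and t is the number of gradient descent steps. With a properly chosen, aggressive step size schedule, O(1/t) rates hold for both the L2 margin and the implicit bias.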
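To make the constant step size claim concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' code: the three-point dataset, the choice of the exponential loss, the step size `eta=1.0`, and all function names are hypothetical. It runs gradient descent on a tiny linearly separable problem whose L2 maximum-margin direction is known in closed form, and records the normalized margin of each iterate so it can be compared with the maximum margin.

```python
import math

# Hypothetical toy data: each row is z_i = y_i * x_i, with norm at most 1.
# The first two rows are the support vectors, so the L2 max-margin
# direction is (1, 1)/sqrt(2) and the maximum margin is 1/sqrt(2).
Z = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.3)]
U_STAR = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


# Maximum margin attained by the known optimal direction.
GAMMA = min(dot(z, U_STAR) for z in Z)


def normalized_margin(w):
    """Smallest margin over the data, divided by the L2 norm of w."""
    return min(dot(z, w) for z in Z) / math.hypot(w[0], w[1])


def run_gd(steps, eta=1.0):
    """Constant-step gradient descent on the empirical exponential risk
    R(w) = (1/n) * sum_i exp(-<w, z_i>), started from w = 0.
    The loss and step size are assumptions made for this sketch."""
    w = (0.0, 0.0)
    margins = []
    n = len(Z)
    for _ in range(steps):
        # Per-example weights exp(-<w, z_i>); the gradient is their
        # negated weighted average of the rows of Z.
        weights = [math.exp(-dot(z, w)) for z in Z]
        grad = tuple(
            -sum(q * z[k] for q, z in zip(weights, Z)) / n for k in range(2)
        )
        w = (w[0] - eta * grad[0], w[1] - eta * grad[1])
        margins.append(normalized_margin(w))
    return w, margins


w, margins = run_gd(10_000)
print(f"maximum margin:             {GAMMA:.4f}")
print(f"margin after 1 step:        {margins[0]:.4f}")
print(f"margin after 10000 steps:   {margins[-1]:.4f}")
```

The norm of `w` grows without bound while its direction drifts toward `U_STAR`, so the normalized margin is the natural quantity to track. This toy run only shows the qualitative behavior and does not verify the stated rates.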

Abstract

Recent work shows that gradient descent on linearly separable data is implicitly biased towards the maximum margin solution. However, no convergence rate which is tight in both n (the dataset size) and t (the tra