We study the data deletion problem for convex models. By leveraging techniques from convex optimization and reservoir sampling, we give the first data deletion algorithms that are able to handle an arbitrarily long sequence of adversarial updates while promising both per-deletion run-time and steady-state error that do not grow with the length of the update sequence. We also introduce several new conceptual distinctions: for example, we can ask that after a deletion, the entire state maintained by the optimization algorithm is statistically indistinguishable from the state that would have resulted had we retrained, or we can ask for the weaker condition that only the observable output is statistically indistinguishable from the observable output that would have resulted from retraining. We are able to give more efficient deletion algorithms under this weaker deletion criterion.

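This paper studies the data deletion problem for convex models. Using techniques from convex optimization and reservoir sampling, we propose the first data deletion algorithms that can handle an arbitrarily long sequence of adversarial updates while guaranteeing that neither the per-deletion run-time nor the steady-state error grows with the length of the update sequence. We also introduce several new conceptual distinctions. We can require that, after a deletion, the entire state maintained by the optimization algorithm be statistically indistinguishable from the state that retraining would have produced, or we can require only that the observable output be statistically indistinguishable. Under this weaker deletion criterion, we are able to give more efficient deletion algorithms.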

Descent-to-Delete: Gradient-Based Methods for Machine Unlearning