``The right to be forgotten'' ensured by user data privacy laws has become increasingly important. Machine unlearning aims to efficiently remove the effect of certain data points on the trained model parameters, so that the resulting model is approximately the same as one retrained from scratch. This work proposes stochastic gradient Langevin unlearning, the first unlearning framework based on noisy stochastic gradient descent (SGD) with privacy guarantees for approximate unlearning problems under a convexity assumption. Our results show that mini-batch gradient updates provide a superior privacy-complexity trade-off compared to their full-batch counterpart. Our unlearning approach offers several algorithmic benefits, including reduced complexity compared to retraining and support for both sequential and batch unlearning. To examine the privacy-utility-complexity trade-off of our method, we conduct experiments on benchmark datasets and compare against prior works. Under the same privacy constraint, our approach achieves similar utility while using only $2\%$ and $10\%$ of the gradient computations required by state-of-the-art gradient-based approximate unlearning methods in the mini-batch and full-batch settings, respectively.
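To make the idea concrete, the following is a minimal sketch of learning with projected noisy mini-batch SGD and then unlearning by continuing the same noisy updates for a few epochs on the dataset with one point removed. It is an illustration under stated assumptions, not the exact algorithm or hyperparameters of this work: the function `noisy_sgd`, the synthetic data, the regularized logistic loss, and the values of `lr`, `sigma`, `radius`, and `lam` are all hypothetical choices, and no privacy guarantee is computed here.

```python
import numpy as np

def noisy_sgd(w, X, y, epochs, batch_size, lr=0.1, sigma=0.05,
              radius=5.0, lam=0.01, seed=0):
    """Projected noisy mini-batch SGD on an L2-regularized logistic loss.

    Labels are assumed to be in {-1, +1}. All defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    for _ in range(epochs):
        perm = rng.permutation(n)  # fresh shuffle of the data each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            margins = y[idx] * (X[idx] @ w)
            # Gradient of mean log(1 + exp(-margin)) plus the L2 term.
            coeff = y[idx] / (1.0 + np.exp(margins))
            grad = -(X[idx] * coeff[:, None]).mean(axis=0) + lam * w
            # Gradient step plus Gaussian noise (Langevin-style update).
            w = w - lr * grad + np.sqrt(2.0 * lr) * sigma * rng.standard_normal(d)
            # Project back onto the Euclidean ball of the given radius.
            norm = np.linalg.norm(w)
            if norm > radius:
                w = w * (radius / norm)
    return w

# Synthetic, linearly separable-ish data with unit-norm features (assumption).
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((200, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)
scores = X @ np.ones(5) + 0.1 * data_rng.standard_normal(200)
y = np.where(scores >= 0, 1.0, -1.0)

# Learning: many epochs from a fixed initialization.
w_learned = noisy_sgd(np.zeros(5), X, y, epochs=20, batch_size=20, seed=2)

# Unlearning request: forget the first data point.
X_new, y_new = X[1:], y[1:]

# Unlearning: a few noisy epochs on the updated data, starting from w_learned.
w_unlearned = noisy_sgd(w_learned, X_new, y_new, epochs=2, batch_size=20, seed=3)

# Baseline: retraining from scratch on the updated data.
w_retrained = noisy_sgd(np.zeros(5), X_new, y_new, epochs=20, batch_size=20, seed=4)

acc_unlearned = np.mean(np.sign(X_new @ w_unlearned) == y_new)
acc_retrained = np.mean(np.sign(X_new @ w_retrained) == y_new)
```

In this sketch the unlearning run uses 2 epochs where retraining uses 20, which mirrors the kind of complexity saving described above; the actual number of unlearning epochs needed for a given privacy constraint depends on the analysis in the paper and is not derived here.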


Stochastic Gradient Langevin Unlearning