Fine-tuning large pretrained models on private datasets risks violating
privacy. Differential privacy is a framework for mitigating privacy
risks by enforcing algorithmic stability. DP-SGD enables training models with
private data in a privacy-preserving manner, but raises new obstacles in the
form of performance loss and significant engineering challenges. We introduce
DP-ZO, a new method for fine-tuning large language models that preserves the
privacy of training data by privatizing zeroth-order optimization. The key
insight behind our method is that the direction of the gradient estimate in
SPSA, the zeroth-order algorithm we use, is always random, and the only
information that depends on private data is the step size, i.e., a scalar.
Therefore, we only need to privatize the scalar step size, which is
memory-efficient. DP-ZO, which can be instantiated with either Laplace or
Gaussian noise, provides a strong privacy-utility trade-off across different
tasks and model sizes under conservative privacy budgets. One noteworthy
result is that DP-ZO exhibits just $1.86\%$ performance degradation due to
privacy at $(1,10^{-5})$-DP when fine-tuning OPT-66B on 1000 training samples
from SQuAD.
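
The idea of privatizing only a scalar can be illustrated with a minimal sketch of one update using Gaussian noise. This is not the authors' implementation: the function name `dp_zo_step`, its parameters (`eps`, `clip`, `noise_multiplier`, `lr`), the per-example clipping of the finite-difference scalar, and the toy quadratic loss are all assumptions made for illustration, and privacy accounting across steps is omitted.

```python
import numpy as np


def dp_zo_step(theta, per_example_loss, batch, rng, *, eps=1e-3, clip=1.0,
               noise_multiplier=1.0, lr=1e-2):
    """One illustrative DP-ZO-style update (hypothetical helper, not the paper's code)."""
    # Random SPSA direction: sampled independently of the private data.
    z = rng.standard_normal(theta.shape)

    # For each example, a single scalar: the central finite-difference
    # estimate of the directional derivative of the loss along z.
    diffs = np.array([
        (per_example_loss(theta + eps * z, ex)
         - per_example_loss(theta - eps * z, ex)) / (2 * eps)
        for ex in batch
    ])

    # Bound each example's contribution (assumed clipping threshold `clip`).
    clipped = np.clip(diffs, -clip, clip)

    # Privatize only the scalar sum; the noise scale is proportional to `clip`.
    noisy_sum = clipped.sum() + rng.normal(0.0, noise_multiplier * clip)
    step = noisy_sum / len(batch)

    # Move along the public random direction by the privatized scalar.
    return theta - lr * step * z


# Toy usage: a quadratic loss per example (assumed for illustration only).
def toy_loss(t, ex):
    return 0.5 * float(np.sum((t - ex) ** 2))


rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0, 0.5])
batch = [np.zeros(3), np.ones(3)]
theta = dp_zo_step(theta, toy_loss, batch, rng)
```

Because the noise is added to one scalar rather than to a full gradient vector, the sketch never materializes per-example gradients, which is where the memory saving described above would come from.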

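DP-ZO is a method that preserves the privacy of training data when fine-tuning large language models by privatizing the step size in zeroth-order optimization. It provides a strong privacy-utility trade-off under conservative privacy budgets, and fine-tuning OPT-66B on 1000 training samples from SQuAD incurs only a 1.86% performance degradation.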