Differentially Private Stochastic Gradient Descent (DP-SGD) and its variants have been proposed to ensure rigorous privacy when fine-tuning large-scale pre-trained language models. However, they rely heavily on the Gaussian mechanism, which may over-perturb the gradients and degrade accuracy, especially in stronger privacy regimes (e.g., a privacy budget $\epsilon < 3$). To address these limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes the first step toward accurately fine-tuning (large) language models under tight composition with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1 \leq \epsilon < 3$). Furthermore, we propose a novel offline optimal noise search method that efficiently derives the sub-optimal DP mechanism and significantly reduces the noise magnitude. For instance, fine-tuning RoBERTa-large (with 300M parameters) on the SST-2 dataset achieves an accuracy of 92.20% (given $\epsilon=0.3$, $\delta=10^{-10}$), drastically outperforming the Gaussian mechanism (e.g., $\sim 50\%$ for small $\epsilon$ and $\delta$). We observe similar results on text generation tasks with GPT-2. Finally, to the best of our knowledge, LMO-DP is also the first solution to accurately fine-tune Llama-2 with strong differential privacy guarantees. The code will be released soon and is available upon request.
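
To illustrate why the Gaussian baseline struggles when $\epsilon$ is small, the following minimal sketch shows (i) the textbook noise scale for a single Gaussian-mechanism release, $\sigma = \Delta\sqrt{2\ln(1.25/\delta)}/\epsilon$, which is only valid for $0 < \epsilon < 1$, and (ii) one clip-and-perturb gradient step in the style of DP-SGD. It is not the LMO-DP mechanism, and it does not perform the multi-step privacy accounting that real DP-SGD training requires. All function names and parameters here (`classic_gaussian_sigma`, `noisy_mean_gradient`, `noise_multiplier`) are hypothetical and chosen only for illustration.

```python
import math
import random


def classic_gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Textbook Gaussian-mechanism noise scale for ONE release.

    Assumption: uses the classic bound, which only holds for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("the classic bound is only valid for 0 < epsilon < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip(grad, clip_norm):
    """Rescale a gradient so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip_norm / max(norm, 1e-12))
    return [g * factor for g in grad]


def noisy_mean_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, clip_norm) for g in per_sample_grads]
    dim = len(per_sample_grads[0])
    std = noise_multiplier * clip_norm  # noise std scales with the clipping bound
    summed = [
        sum(g[j] for g in clipped) + rng.gauss(0.0, std) for j in range(dim)
    ]
    return [s / len(per_sample_grads) for s in summed]


if __name__ == "__main__":
    # Noise scale for the abstract's example budget (single release, unit sensitivity).
    print(round(classic_gaussian_sigma(0.3, 1e-10), 2))

    # One noisy gradient step on two toy per-sample gradients.
    rng = random.Random(0)
    print(noisy_mean_gradient([[3.0, 4.0], [0.3, 0.4]], 1.0, 1.0, rng))
```

Under these assumptions, a single release at $\epsilon=0.3$, $\delta=10^{-10}$ needs a noise standard deviation of roughly 22.7 times the sensitivity, which suggests why gradients clipped to unit norm can be swamped by Gaussian noise in this regime.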

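By proposing a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, we can accurately fine-tune large-scale language models with a sub-optimal differential privacy mechanism in strong privacy regimes, and we propose an offline optimal noise search method to reduce the noise magnitude. Substantially outperforming the Gaussian mechanism, fine-tuning RoBERTa-large (with 300M parameters) on the SST-2 dataset achieves an accuracy of 92.20% (given $\epsilon=0.3$, $\delta=10^{-10}$); similar results are observed on text generation tasks with GPT-2. In addition, to the best of our knowledge, LMO-DP is the first solution to accurately fine-tune Llama-2 with good differential privacy guarantees.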

LMO-DP: Optimizing the Randomization Mechanism for Differentially Private Fine-Tuning of (Large) Language Models