Recent data-extraction attacks have exposed that language models can memorize some training samples verbatim. This is a vulnerability that can compromise the privacy of the model's training data. In this work, we introduce SubMix: a practical protocol for private next-token prediction designed to prevent privacy violations by language models that were fine-tuned on a private corpus after pre-training on a public corpus. We show that SubMix limits the leakage of information that is unique to any individual user in the private corpus via a relaxation of group differentially private prediction. Importantly, SubMix admits a tight, data-dependent privacy accounting mechanism, which allows it to thwart existing data-extraction attacks while maintaining the utility of the language model. SubMix is the first protocol that maintains privacy even when publicly releasing tens of thousands of next-token predictions made by large transformer-based models such as GPT-2.

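This paper introduces SubMix, a practical protocol for preventing language models from leaking information about a private corpus. By applying a relaxed form of differentially private prediction, it limits the leakage of individual users' information while preserving the utility of the language model. SubMix is the first protocol that maintains privacy even when tens of thousands of predictions from large transformer-based models such as GPT-2 are released publicly.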

Submix: Practical Private Prediction for Large-Scale Language Models