BriefGPT.xyz
Dec, 2020
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss...
TL;DR
This paper identifies a tension between training large language models and protecting private datasets. It proposes an attack that extracts training data by querying a language model and, using GPT-2 as a case study, successfully recovers sensitive information such as personal details and code from the training set. This highlights the privacy and security risks inherent in training data and the need for further technical safeguards.
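The extraction attack summarized above can be sketched as a two-step pipeline: generate many samples from the model, then rank them by a metric that flags likely-memorized text. One metric the paper uses is the ratio of model perplexity to zlib compression entropy (memorized strings tend to be likely under the model yet not trivially compressible). The sketch below is illustrative only; `toy_log_likelihood` is a stand-in assumption for a real language model's log-likelihood.

```python
import math
import zlib

def zlib_entropy(text: str) -> int:
    # Compressed length in bits: a cheap proxy for the string's entropy.
    return 8 * len(zlib.compress(text.encode("utf-8")))

def rank_candidates(samples, log_likelihood):
    """Rank generated samples by perplexity / zlib-entropy ratio.

    Lower scores suggest memorization: the model assigns the text high
    likelihood even though the text itself is not highly repetitive.
    """
    scored = []
    for s in samples:
        n_tokens = max(len(s.split()), 1)
        ppl = math.exp(-log_likelihood(s) / n_tokens)  # per-token perplexity
        scored.append((ppl / zlib_entropy(s), s))
    return [s for _, s in sorted(scored)]

# Illustrative stand-in for a real LM's log-likelihood (assumption):
# assigns a fixed -2.0 log-probability per whitespace token.
def toy_log_likelihood(text: str) -> float:
    return -2.0 * len(text.split())

samples = ["the quick brown fox", "aaaa aaaa aaaa aaaa aaaa"]
print(rank_candidates(samples, toy_log_likelihood)[0])
```

With this toy likelihood both samples have equal perplexity, so the repetitive string (small zlib entropy, larger ratio) is ranked as less likely memorized, which is the behavior the metric is designed to produce.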
Abstract
It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.