Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein...
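TL;DR: This work introduces autoregressive (AR) poisoning, a method that generates poisoned data without requiring access to the broader dataset. Compared with existing unlearnable methods, our AR poisons are more resistant to common defenses such as adversarial training and strong data augmentation.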
Abstract
The prevalence of data scraping from social media as a means to obtain
datasets has led to growing concerns regarding unauthorized use of data. Data
poisoning attacks have been proposed as a bulwark against scraping, as they
make data "unlearnable" by adding small, imperceptible pertur