BriefGPT.xyz
Jan, 2024
Imperio: Language-Guided Backdoor Attacks for Arbitrary Model Control
Ka-Ho Chow, Wenqi Wei, Lei Yu
TL;DR
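This paper uses language understanding to strengthen backdoor attacks, letting an adversary control a victim model and make it produce desired outputs. The attack is effective and resilient on complex datasets.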
Abstract
Revolutionized by the transformer architecture, natural language processing (NLP) has received unprecedented attention. While advancements in NLP models have led to extensive research into their …