Feb 2024
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
Jiawei Liang, Siyuan Liang, Man Luo, Aishan Liu, Dongchen Han...
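TL;DR: The VL-Trojan attack successfully induces the target output at inference time, clearly outperforming the baselines (+62.52%), and it remains robust across model scales and few-shot in-context reasoning scenarios.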