BriefGPT.xyz
May, 2024
Safety Alignment for Vision Language Models
Zhendong Liu, Yuanbi Nie, Yingshui Tan, Xiangyu Yue, Qiushi Cui...
TL;DR
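By adding safety modules (a safety projector, safety tokens, and a safety head) through a two-stage training process, we improve the visual safety alignment of existing vision language models and effectively strengthen the model's defense against unsafe images.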
Abstract
Benefiting from the powerful capabilities of large language models (LLMs), pre-trained visual encoder models connected to an LLM can realize vision language models (VLMs). However, existing research shows that t