BriefGPT.xyz
May, 2025
Unmasking the Canvas: A Dynamic Benchmark for Image Generation
Unmasking the Canvas: A Dynamic Benchmark for Image Generation Jailbreaking and LLM Content Safety
Variath Madhupal Gautham Nair, Vishal Varma Dantuluri
TL;DR
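This work addresses the problem that the content safety of existing large language models is vulnerable to prompt attacks in image generation tasks. We propose the "Unmasking the Canvas" benchmark (UTCB), which uses structured prompt engineering and multilingual obfuscation to evaluate model vulnerability. The benchmark can be updated dynamically and has significant potential for safety evaluation and improvement.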
Abstract
Existing large language models (LLMs) are advancing rapidly and produce outstanding results in image generation tasks, yet their content safety …