BriefGPT.xyz
Apr, 2025
DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks
Yupei Liu, Yuqi Jia, Jinyuan Jia, Dawn Song, Neil Zhenqiang Gong
TL;DR
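This work addresses the limited effectiveness of existing methods for detecting prompt injection attacks. It proposes DataSentinel, a novel game-theoretic method that can detect inputs strategically adapted to evade detection. The results indicate that DataSentinel effectively detects both existing and adaptive prompt injection attacks, demonstrating its potential impact as a defense.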
Abstract
LLM-integrated applications and agents are vulnerable to prompt injection attacks, where an attacker injects prompts into their inputs to induce attacker-desired outputs. A detection method aims to determine whet