June 2024

AgentDojo: A Dynamic Environment for Evaluating Attacks and Defenses for LLM Agents

TL;DR: AgentDojo is a framework for evaluating the adversarial robustness of AI agents that are vulnerable to prompt injection attacks. It includes realistic tasks, security test cases, and attack and defense paradigms, and it highlights the need for new design principles to ensure reliable and robust agent performance.