BriefGPT.xyz
Jan, 2024
Adaptive Discounting of Training Time Attacks
Ridhima Bector, Abhay Aradhya, Chai Quek, Zinovi Rabinovich
TL;DR
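By developing a specialised version of the DDPG algorithm, called gammaDDPG, the authors demonstrate a stronger form of constructive training-time attack (C-TTA): one that remains possible even when the target behaviour is un-adoptable, both because of the environment dynamics and because it is non-optimal with respect to the victim's objectives.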
Abstract
Among the most insidious attacks on reinforcement learning (RL) solutions are training-time attacks (TTAs) that create loopholes and backdoors in the learned behaviour. Not limited to a simple disruption,
→