关键词adversarially designed training examples
搜索结果 - 1
  • 微调对齐语言模型牺牲了安全性,即使用户并无此意!
    PDF9 months ago
Prev
Next