BriefGPT.xyz
Mar, 2024
Bypassing LLM Watermarks with Color-Aware Substitutions
Qilong Wu, Varun Chandrasekaran
TL;DR
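Watermarking can be used to tell whether a piece of text was written by a human or generated by a large language model (LLM). We propose the first "color-aware" attack, which successfully evades watermark detection and can remove the watermark from arbitrarily long watermarked text.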
Abstract
Watermarking approaches are proposed to identify if text being circulated is human or large language model (LLM) generated. The state-of-the-art watermarking strategy of Kirchenbauer et al. (2023a) biases the LLM