The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement after deployment. This has driven the development of "copyright takedown" methods: post-training approaches aimed at preventing models from generating content substantially similar to copyrighted works. While current mitigation approaches are somewhat effective for average-case risks, we demonstrate that they overlook worst-case copyright risks, as exhibited by the existence of long, verbatim quotes from copyrighted sources. We propose BloomScrub, a remarkably simple yet highly effective inference-time approach that provides certified copyright takedown. Our method repeatedly interleaves quote detection with rewriting techniques to transform potentially infringing segments. By leveraging efficient data sketches (Bloom filters), our approach enables scalable copyright screening even for large-scale real-world corpora. When quotes beyond a length threshold cannot be removed, the system can abstain from responding, offering certified risk reduction. Experimental results show that BloomScrub reduces infringement risk, preserves utility, and accommodates different levels of enforcement stringency with adaptive abstention. Our results suggest that lightweight, inference-time methods can be surprisingly effective for copyright prevention.
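The detect-rewrite-abstain loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `BloomFilter`, `index_corpus`, `longest_quote`, and `scrub`, the whitespace tokenization, the n-gram width, and the `rewrite` callback (which stands in for an LLM-based paraphraser) are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Iterable, List, Optional


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str) -> Iterable[int]:
        # Derive all bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def ngrams(tokens: List[str], n: int) -> List[str]:
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def index_corpus(documents: Iterable[str], n: int) -> BloomFilter:
    """Insert every word n-gram of the protected corpus into the filter."""
    bloom = BloomFilter()
    for doc in documents:
        for gram in ngrams(doc.split(), n):
            bloom.add(gram)
    return bloom


def longest_quote(text: str, bloom: BloomFilter, n: int) -> int:
    """Length in tokens of the longest run of consecutive n-gram hits.

    Filter false positives can only inflate this value, so it is an upper
    bound on the true longest verbatim quote of at least n tokens.
    """
    best = run = 0
    for gram in ngrams(text.split(), n):
        if gram in bloom:
            run += 1
            best = max(best, run + n - 1)
        else:
            run = 0
    return best


def scrub(response: str, bloom: BloomFilter, n: int, threshold: int,
          rewrite: Callable[[str], str], max_rounds: int = 3) -> Optional[str]:
    """Alternate detection and rewriting; return None to abstain."""
    for round_idx in range(max_rounds + 1):
        if longest_quote(response, bloom, n) <= threshold:
            return response  # no detected quote exceeds the threshold
        if round_idx < max_rounds:
            response = rewrite(response)  # hypothetical paraphrasing step
    return None  # quotes could not be removed within the budget: abstain
```

Because a Bloom filter has no false negatives, any response returned by this sketch contains no detected quote longer than `threshold` tokens under the same tokenization; false positives only make the check more conservative, at the cost of extra rewrites or abstentions.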

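This work addresses the potential copyright infringement that arises when large language models are exposed to copyrighted material during pre-training. It proposes BloomScrub, a method that combines quote detection with rewriting to identify and handle potentially infringing content, substantially reducing infringement risk. Experimental results show that the method not only reduces infringement risk but also preserves utility across different levels of enforcement stringency, indicating the potential of lightweight inference-time methods for copyright prevention.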

Certified Mitigation of Worst-Case LLM Copyright Infringement