BriefGPT.xyz
Sep, 2024
MEOW: MEMOry Supervised LLM Unlearning Via Inverted Facts
Tianle Gu, Kexin Huang, Ruilin Luo, Yuanqi Yao, Yujiu Yang...
TL;DR
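This paper addresses the problem that large language models (LLMs) may memorize sensitive information and proposes a new unlearning method, MEOW, that overcomes the utility, efficiency, and robustness problems of conventional approaches. By generating inverted facts and quantifying memorization with MEMO, MEOW significantly improves forget quality without significantly harming model utility, and it demonstrates its advantages on natural language understanding and generation tasks.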
Abstract
Large Language Models (LLMs) can memorize sensitive information, raising concerns about potential misuse. LLM Unlearning, a post-hoc approach to remove this information from trained LLMs, offers a promising solut