This study investigates Machine Unlearning (MU), an emerging field that addresses the problem of neural models inadvertently retaining personal or sensitive data. A novel approach is introduced to achieve precise and selective forgetting within language models. Unlike previous methods that adopt a completely opposite training objective, this approach aims to limit the adverse effects on language model performance, particularly in generation tasks. Two evaluation metrics are also proposed, Sensitive Information Extraction Likelihood (S-EL) and Sensitive Information Memory Accuracy (S-MA), both designed to measure how effectively sensitive information has been eliminated. To support the forgetting framework, a method for annotating sensitive scopes is presented, with both an online and an offline strategy. The online selection mechanism relies on language probability scores to keep computation cheap, while the offline annotation uses a two-stage process based on Large Language Models (LLMs).
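To make the online selection idea more concrete, the following is a minimal sketch of how a sensitive span might be chosen from per-token probability scores. It is not the authors' implementation: the abstract does not say how the scores are turned into a span, so the sketch assumes that the contiguous window with the lowest mean log-probability is treated as the sensitive scope. The function names `select_sensitive_span` and `forgetting_mask`, the fixed window length, and the low-probability criterion are all hypothetical.

```python
from typing import List, Tuple


def select_sensitive_span(token_logprobs: List[float], span_len: int) -> Tuple[int, int]:
    """Return (start, end) of the contiguous window with the lowest mean log-probability.

    ASSUMPTION: low-probability tokens are taken as the sensitive scope.
    The source only states that language probability scores are used.
    """
    n = len(token_logprobs)
    if span_len <= 0 or span_len > n:
        raise ValueError("span_len must be between 1 and the number of tokens")

    # Sliding-window sum, so the scan is linear in the sequence length.
    window_sum = sum(token_logprobs[:span_len])
    best_sum, best_start = window_sum, 0
    for start in range(1, n - span_len + 1):
        window_sum += token_logprobs[start + span_len - 1] - token_logprobs[start - 1]
        if window_sum < best_sum:
            best_sum, best_start = window_sum, start
    return best_start, best_start + span_len


def forgetting_mask(num_tokens: int, span: Tuple[int, int]) -> List[bool]:
    """Mark the tokens inside the selected span; only these would be unlearned."""
    start, end = span
    return [start <= i < end for i in range(num_tokens)]


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities from a language model.
    logprobs = [-0.1, -0.2, -3.0, -4.0, -0.3]
    span = select_sensitive_span(logprobs, span_len=2)
    print(span, forgetting_mask(len(logprobs), span))
```

Under this assumption, a mask of this kind would restrict the forgetting objective to the selected tokens and leave the rest of the sequence untouched, which is one way to read "selective" forgetting. Because the scores come from a single forward pass of the model, no external annotator is needed at this stage.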

Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models