Automatic Speech Recognition (ASR) traditionally assumes known domains, but
adding data from a new domain raises concerns about computational
inefficiencies linked to retraining models on both existing and new domains.
Fine-tuning solely on the new domain risks Catastrophic Forgetting (CF). To
address this, Lifelong Learning (LLL) algorithms have been proposed for ASR. Prior
research has explored techniques such as Elastic Weight Consolidation,
Knowledge Distillation, and Replay, all of which necessitate either additional
parameters or access to prior domain data. We propose Sequential Model Editing
as a novel method to continually learn new domains in ASR systems. Unlike
previous methods, our approach requires neither access to prior datasets nor
the introduction of extra parameters. Our study demonstrates up to 15% Word
Error Rate Reduction (WERR) over the fine-tuning baseline, and superior
efficiency over other LLL techniques, on the CommonVoice English multi-accent
dataset.
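
The WERR figure quoted above can be read with a small worked example. This is a minimal sketch that assumes WERR means the relative reduction in word error rate against the baseline, which the abstract does not spell out; the function name and the WER values are hypothetical and are not results from the paper.

```python
def relative_werr(baseline_wer: float, new_wer: float) -> float:
    """Relative Word Error Rate Reduction, in percent (assumed definition)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers chosen only to illustrate the arithmetic.
baseline = 20.0  # WER (%) of a fine-tuning baseline
edited = 17.0    # WER (%) of the edited model
werr = relative_werr(baseline, edited)
print(f"WERR: {werr:.1f}%")
```

Under this definition, a drop from 20.0% to 17.0% WER is a 15% relative reduction, even though the absolute difference is only 3 points.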

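This study proposes a new method, Sequential Model Editing, for learning new domains in automatic speech recognition systems. Experiments show that it reduces word error rate relative to the fine-tuning baseline and is more efficient than other lifelong learning techniques.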

Sequential Editing for Lifelong Training of Speech Recognition Models

Although large language models (LLMs) have recently demonstrated impressive
results, they still suffer from hallucination, i.e., the generation of false
information. Model editing is the task of fixing factual mistakes in LLMs; yet,
most previous works treat it as a one-time task, paying little attention to
ever-emerging mistakes generated by LLMs. We address the task of sequential
model editing (SME) that aims to rectify mistakes continuously. A Dynamic
Auxiliary Fusion Network (DAFNet) is designed to enhance the semantic
interaction among the factual knowledge within the entire sequence, preventing
catastrophic forgetting during the editing process of multiple knowledge
triples. Specifically, (1) for semantic fusion within a relation triple, we
aggregate the intra-editing attention flow into auto-regressive self-attention
with token-level granularity in LLMs. We further leverage multi-layer diagonal
inter-editing attention flow to update the weighted representations of the
entire sequence at sequence-level granularity. (2) Considering that auxiliary
parameters are required to store the knowledge for sequential editing, we
construct a new dataset named DAFSet, fulfilling recent, popular, long-tail and
robust properties to enhance the generality of sequential editing. Experiments
show DAFNet significantly outperforms strong baselines in single-turn and
sequential editing. The usage of DAFSet also consistently improves the
performance of other auxiliary network-based methods in various scenarios.

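For the task of model editing in large language models, this work designs a Dynamic Auxiliary Fusion Network (DAFNet) to enhance semantic interaction and constructs a new dataset, DAFSet, for sequential editing. Experiments show that DAFNet significantly outperforms other methods in both single-turn and sequential editing.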

DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models

Large Transformer-based Pretrained Language Models (PLMs) dominate almost all
Natural Language Processing (NLP) tasks. Nevertheless, they still make mistakes
from time to time. For a model deployed in an industrial environment, fixing
these mistakes quickly and robustly is vital to improve user experiences.
Previous works formalize such problems as Model Editing (ME) and mostly focus
on fixing one mistake. However, the one-mistake-fixing scenario is not an
accurate abstraction of the real-world challenge. In the deployment of AI
services, there are ever-emerging mistakes, and the same mistake may recur if
not corrected in time. A preferable solution is thus to rectify mistakes
continually, as soon as they appear. We therefore extend the existing ME into
Sequential Model Editing (SME) to help develop more practical editing methods.
Our study shows that most current ME methods can yield unsatisfactory results
in this scenario. We then introduce Transformer-Patcher, a novel model editor
that can shift the behavior of transformer-based models by simply adding and
training a few neurons in the last Feed-Forward Network layer. Experimental
results on both classification and generation tasks show that
Transformer-Patcher can successively correct up to thousands of errors
(Reliability) and generalize to their equivalent inputs (Generality) while
retaining the model's accuracy on irrelevant inputs (Locality). Our method
outperforms previous fine-tuning and HyperNetwork-based methods and achieves
state-of-the-art performance for Sequential Model Editing (SME). The code is
available at this https URL.
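
The idea of adding a few neurons to a feed-forward layer while leaving the original weights frozen can be illustrated with a toy forward pass. This is a minimal sketch, not the authors' implementation: all names (`ffn`, `patched_ffn`, `PATCHES`) and all numbers are hypothetical, and the patch neuron is set by hand here, whereas the abstract describes the added neurons as trained.

```python
# Illustrative sketch only: hypothetical names and values, not from the paper.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def ffn(x, w_in, b_in, w_out, b_out):
    """Original (frozen) layer: w_out @ relu(w_in @ x + b_in) + b_out."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(w_in, x), b_in)]
    return [o + b for o, b in zip(matvec(w_out, hidden), b_out)]


def patched_ffn(x, w_in, b_in, w_out, b_out, patches):
    """Frozen layer plus extra neurons; each patch is (key, bias, value)."""
    out = ffn(x, w_in, b_in, w_out, b_out)
    for key, bias, value in patches:
        activation = max(0.0, sum(k * xi for k, xi in zip(key, x)) + bias)
        out = [o + activation * v for o, v in zip(out, value)]
    return out


# Toy layer: 2-d input, 2 hidden neurons, 2-d output (identity weights).
W_IN = [[1.0, 0.0], [0.0, 1.0]]
B_IN = [0.0, 0.0]
W_OUT = [[1.0, 0.0], [0.0, 1.0]]
B_OUT = [0.0, 0.0]

# One hand-set patch neuron that fires only when x[0] - x[1] > 1.
PATCHES = [([1.0, -1.0], -1.0, [0.0, 2.0])]

edited_input = [3.0, 1.0]     # patch activation: relu(3 - 1 - 1) = 1
unrelated_input = [1.0, 3.0]  # patch activation: relu(1 - 3 - 1) = 0

print(patched_ffn(edited_input, W_IN, B_IN, W_OUT, B_OUT, PATCHES))
print(patched_ffn(unrelated_input, W_IN, B_IN, W_OUT, B_OUT, PATCHES))
```

In this toy setting the patch shifts the output for the input it was set up to respond to and leaves the other input's output identical to the unpatched layer, which loosely mirrors the Reliability and Locality properties named in the abstract.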

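This study proposes a model editor called Transformer-Patcher, which successively corrects a model's mistakes by simply adding and training a few neurons in the last feed-forward network layer. It achieves state-of-the-art performance on Sequential Model Editing (SME) and addresses the problem of quickly and accurately fixing mistakes in models deployed in industrial environments.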