Advanced life forms, sustained by the synergistic interaction of neural
cognitive mechanisms, continually acquire and transfer knowledge throughout
their lifespan. In contrast, contemporary machine learning paradigms exhibit
limitations in emulating the facets of continual learning (CL). Nonetheless,
the emergence of large language models (LLMs) presents promising avenues for
realizing CL via interactions with these models. Drawing on Complementary
Learning System theory, this paper presents a novel Interactive Continual
Learning (ICL) framework, enabled by collaborative interactions among models of
various sizes. Specifically, we assign a ViT model as System1 and a multimodal
LLM as System2. To enable the memory module to deduce tasks from class
information and enhance Set2Set retrieval, we propose the Class-Knowledge-Task
Multi-Head Attention (CKT-MHA). Additionally, to improve memory retrieval in
System1 through enhanced geometric representation, we introduce the CL-vMF
mechanism, based on the von Mises-Fisher (vMF) distribution. Meanwhile, we
introduce the von Mises-Fisher Outlier Detection and Interaction (vMF-ODI)
strategy to identify hard examples, thus enhancing collaboration between
System1 and System2 for complex reasoning. Comprehensive evaluation
of our proposed ICL demonstrates significant resistance to forgetting and
superior performance relative to existing methods.
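
To make the routing idea concrete, here is a minimal sketch of how a vMF-style posterior could be used to flag hard examples for System2. It is an illustration under stated assumptions, not the paper's CL-vMF or vMF-ODI implementation: the function names, the shared concentration `kappa`, and the confidence `threshold` are all hypothetical. With a shared `kappa` and uniform class priors, the vMF normalizing constant cancels, so the class posterior reduces to a softmax over scaled cosine similarities.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def vmf_posteriors(features, mean_dirs, kappa=10.0):
    """Class posteriors under vMF components with a shared concentration.

    Assumption: every class uses the same kappa and a uniform prior, so the
    posterior is softmax(kappa * cosine_similarity).
    """
    z = l2_normalize(features)
    mu = l2_normalize(mean_dirs)
    logits = kappa * z @ mu.T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def route(features, mean_dirs, kappa=10.0, threshold=0.6):
    """Return System1 predictions and a mask of examples to send to System2.

    Hypothetical rule: an example is "hard" when its top posterior falls
    below the threshold.
    """
    probs = vmf_posteriors(features, mean_dirs, kappa)
    preds = probs.argmax(axis=1)
    is_hard = probs.max(axis=1) < threshold
    return preds, is_hard
```

In this sketch, a feature lying close to one class direction is kept by System1, while a feature equidistant from two class directions gets a flat posterior and is flagged for the slower model.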

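Using an interactive continual learning framework that combines large language models, a memory retrieval mechanism, and collaborative interaction among models, this work achieves continual learning that resists forgetting and delivers superior performance.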

Interactive Continual Learning: Fast and Slow Thinking

We study the problem of continually training an instruction-following agent
through feedback provided by users during collaborative interactions. During
interaction, human users instruct an agent using natural language, and provide
real-time binary feedback as they observe the agent's instruction execution. We
cast learning as a contextual bandit problem, converting the user feedback to
immediate reward. We evaluate through multiple rounds of human-agent
interactions, demonstrating a 15.4% absolute improvement in instruction execution
over time. We also show our approach is robust to several design variations,
and that the feedback signal is roughly equivalent to the learning signal of
supervised demonstration data.
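
The contextual bandit framing can be sketched as follows. This is a minimal illustration, not the paper's model: the linear softmax policy, the class and function names, the learning rate, and the mapping of binary feedback to rewards of +1 and -1 are all assumptions made for the example. The update is a single-step policy-gradient rule, in which the gradient of the log-probability of the chosen action is scaled by the immediate reward.

```python
import numpy as np

class SoftmaxBanditPolicy:
    """Hypothetical linear softmax policy over a fixed action set."""

    def __init__(self, n_features, n_actions, lr=0.1, seed=0):
        self.W = np.zeros((n_actions, n_features))
        self.lr = lr
        self.rng = np.random.default_rng(seed)

    def probs(self, context):
        logits = self.W @ context
        logits = logits - logits.max()  # numerical stability
        p = np.exp(logits)
        return p / p.sum()

    def act(self, context):
        """Sample an action from the current policy."""
        p = self.probs(context)
        return int(self.rng.choice(len(p), p=p))

    def update(self, context, action, reward):
        """One policy-gradient step using the immediate reward."""
        p = self.probs(context)
        grad_logits = -p
        grad_logits[action] += 1.0  # d log pi(a|x) / d logits = onehot(a) - p
        self.W += self.lr * reward * np.outer(grad_logits, context)

def feedback_to_reward(positive):
    """Assumed mapping from binary user feedback to a scalar reward."""
    return 1.0 if positive else -1.0
```

A typical loop would call `act` on the current context, show the result to the user, and pass `feedback_to_reward(...)` of the user's response to `update`. There is no credit assignment over future steps, which is what distinguishes the bandit setting from full reinforcement learning.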

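This work studies the problem of training an instruction-following agent from natural language instructions and real-time binary feedback that users provide during human-agent collaborative interaction. Learning is cast as a contextual bandit problem in which user feedback is converted to immediate reward. The approach improves instruction execution over time, and the feedback signal is roughly equivalent to the learning signal of supervised demonstration data.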