Large language models (LLMs) show an innate skill for solving language-based tasks. However, prior findings suggest that they cannot adjust when information or task-solving skills become outdated, because their knowledge, stored directly in their parameters, remains static over time. Tool use helps by offloading work to systems that the LLM can access through an interface, but LLMs that use tools must still adapt to nonstationary environments over prolonged use, as new tools can emerge and existing tools can change. Nevertheless, tools require less specialized knowledge; we therefore hypothesize that tool-using LLMs are better suited for continual learning (CL), as they rely less on parametric memory for solving tasks and instead focus on learning when to apply pre-defined tools. To verify this, we develop a synthetic benchmark and then aggregate existing NLP tasks to form a more realistic testing scenario. While we demonstrate that scaling model size is not a solution, regardless of tool usage, continual learning techniques can enable tool LLMs to both adapt faster and forget less, highlighting their potential as continual learners.

Towards Practical Tool Usage for Continually Learning LLMs