Retrieval-augmented generation (RAG) is a promising way to help large
language models (LLMs) generate more factual, accurate, and up-to-date
content. Existing methods either optimize prompts to guide LLMs in leveraging
retrieved information or directly fine-tune the LLMs to adapt to RAG scenarios.
Although fine-tuning can yield better performance, it often compromises the
LLMs' general generation capabilities by modifying their parameters. This
limitation poses challenges in practical applications, especially when LLMs are
already deployed, as parameter adjustments may affect their original
functionality. To address this, we propose a novel method that involves
learning scalable and pluggable virtual tokens for RAG. By maintaining the
LLMs' original parameters and fine-tuning only the embeddings of these
pluggable tokens, our approach not only enhances LLMs' performance but also
preserves their general generation capabilities. Furthermore, we design several
training strategies to improve the scalability, flexibility, and
generalizability of our method. Comprehensive experiments across nine
question-answering tasks demonstrate the superiority of our approach.
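
As a rough illustration of the idea above, the following toy sketch keeps a stand-in "model" frozen and updates only the virtual-token embeddings. Everything in it is an assumption made for illustration: the linear scorer `frozen_model`, the helper names `build_input` and `train_step`, the placement of the virtual tokens between the retrieved passages and the question, and the use of the first `k` tokens to vary their number are not specified in the abstract and should not be read as the authors' implementation.

```python
# Toy sketch: only the virtual-token embeddings are trainable; the "LLM" is frozen.
# The scorer, shapes, token placement, and learning rate are illustrative assumptions.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_input(doc_embs, question_embs, virtual_embs, k=None):
    """Concatenate retrieved-document, virtual-token, and question embeddings.

    k is how many virtual tokens to plug in: None uses all of them,
    0 reproduces the plain input without any virtual tokens.
    """
    if k is None:
        k = len(virtual_embs)
    if not 0 <= k <= len(virtual_embs):
        raise ValueError("k must be between 0 and the number of virtual tokens")
    return doc_embs + virtual_embs[:k] + question_embs

def frozen_model(seq, weights):
    """Stand-in for a frozen LLM: a fixed linear score summed over positions."""
    return sum(dot(weights, emb) for emb in seq)

def train_step(doc_embs, question_embs, virtual_embs, weights, target, lr=0.01):
    """One gradient step on squared error that updates virtual_embs only."""
    seq = build_input(doc_embs, question_embs, virtual_embs)
    error = frozen_model(seq, weights) - target
    # d(error^2)/d(v_i) = 2 * error * weights for every virtual token v_i.
    updated = [[v - lr * 2.0 * error * w for v, w in zip(emb, weights)]
               for emb in virtual_embs]
    return updated, error ** 2  # loss is measured before the update

weights = [0.5, -1.0, 0.25]                    # frozen "LLM" parameters, never updated
docs = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]      # retrieved-passage embeddings
question = [[1.0, 1.0, 0.0]]                   # question embeddings
virtual = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # trainable virtual-token embeddings

losses = []
for _ in range(20):
    virtual, loss = train_step(docs, question, virtual, weights, target=3.0)
    losses.append(loss)
```

Calling `build_input(docs, question, virtual, k=0)` yields the plain input without virtual tokens, which is the sense in which the tokens are pluggable in this sketch; the frozen weights are never written to, so the stand-in model behaves exactly as before when the tokens are removed.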

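By learning scalable and pluggable virtual tokens, our method keeps the large language model's original parameters unchanged and fine-tunes only the embeddings of these pluggable tokens, improving the model's performance while preserving its general generation capabilities.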

One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models

Distributional Reinforcement Learning (RL) estimates the return distribution
mainly by learning quantile values through minimization of the quantile Huber
loss function. This loss involves a threshold parameter that is often selected
heuristically or via hyperparameter search, which may not generalize well and
can be suboptimal.
This paper introduces a generalized quantile Huber loss function derived from
Wasserstein distance (WD) calculation between Gaussian distributions, capturing
noise in predicted (current) and target (Bellman-updated) quantile values.
Compared to the classical quantile Huber loss, this innovative loss function
enhances robustness against outliers. Notably, the classical Huber loss
function can be seen as an approximation of our proposed loss, enabling
parameter adjustment by approximating the amount of noise in the data during
the learning process. Empirical tests on Atari games, a common application of
distributional RL, and on a recent hedging strategy that uses distributional RL
validate the effectiveness of our proposed loss function and its potential for
parameter adjustments in distributional RL.
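
For reference, the classical quantile Huber loss mentioned above can be sketched as follows, in a form commonly used for quantile-regression-based distributional RL. This is the baseline loss, not the generalized loss proposed in the paper; the function names and the default threshold `kappa=1.0` are illustrative assumptions.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    if abs(u) <= kappa:
        return 0.5 * u * u
    return kappa * (abs(u) - 0.5 * kappa)

def quantile_huber(u, tau, kappa=1.0):
    """Classical quantile Huber loss.

    u     : error, target quantile value minus predicted quantile value
    tau   : quantile level in (0, 1)
    kappa : threshold between the quadratic and linear regimes
    Note: some formulations additionally divide by kappa; with kappa = 1
    the two variants coincide.
    """
    indicator = 1.0 if u < 0 else 0.0
    return abs(tau - indicator) * huber(u, kappa)

# The same large error is penalized differently depending on the threshold,
# which is why the choice of kappa matters.
loss_small_kappa = quantile_huber(3.0, tau=0.5, kappa=1.0)
loss_large_kappa = quantile_huber(3.0, tau=0.5, kappa=5.0)
```

With `kappa=1.0` the error of 3.0 falls in the linear regime, while with `kappa=5.0` it is still penalized quadratically, so the threshold directly controls how strongly outliers influence the loss.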

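This paper proposes a generalized quantile Huber loss function for estimating return distributions in distributional reinforcement learning. The loss is derived from the Wasserstein distance between Gaussian distributions, which accounts for noise in the quantile values. Compared with the classical quantile Huber loss, the proposed loss is more robust to outliers, and empirical tests on Atari games and a recent hedging strategy validate its effectiveness in distributional reinforcement learning and its potential for parameter adjustment.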