Aligning Large Language Models (LLMs) with general human preferences has proven crucial for improving the quality of interaction between LLMs and humans. However, human values are inherently diverse across individuals, so aligning LLMs solely with general preferences is insufficient. Personalizing LLMs according to individual feedback is a promising solution, but it raises challenges for the efficiency of alignment algorithms. In this work, we introduce a flexible paradigm for individual preference alignment. Our method fundamentally improves efficiency by disentangling preference representation from text generation in LLMs. We validate our approach across multiple text generation tasks and demonstrate that it achieves alignment quality comparable to or better than that of PEFT-based methods, while reducing the additional training time required for each new individual preference by $80\%$ to $90\%$ relative to those methods.

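This work addresses the inefficiency of aligning large language models (LLMs) with individual human preferences. We propose a flexible paradigm for individual preference alignment that substantially improves the efficiency of alignment algorithms by separating preference representation from text generation. Experiments on text generation tasks show that our method performs on par with or better than PEFT-based methods, while reducing the additional training time required for each new individual preference by 80% to 90%.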

Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment