Jun, 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong...
TL;DR
This paper proposes a new way of training language models: sequence-level preferences are grounded into token-level training guidance, and the learned guidance is then used to improve the LM, achieving competitive performance across different tasks.
Abstract
Aligning language models (LMs) with preferences is an important problem in natural language generation. A key challenge is that preferences are typically provided at the sequence level while LM training and gener…