BriefGPT.xyz
Feb, 2024
Active Preference Learning for Large Language Models
William Muldrew, Peter Hayes, Mingtian Zhang, David Barber
TL;DR
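For fine-tuning with DPO, we develop an active learning strategy that makes better use of preference labels. It relies on the predictive entropy of the language model and on a certainty measure of the implicit preference model optimized by DPO, and it improves both the rate of learning and the final performance on pairwise preference data.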
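The two quantities named in the TL;DR can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The function names, the Monte Carlo entropy estimate, the use of the absolute reward gap as the certainty measure, and the default `beta` are all assumptions made for the example. The only part taken from the standard DPO formulation is the implicit reward, beta * (log pi_theta(y|x) - log pi_ref(y|x)).

```python
import math


def predictive_entropy(sample_logprobs):
    """Monte Carlo entropy estimate for one prompt.

    sample_logprobs: log p(y|x) for completions y sampled from the model.
    The estimate is the negative mean of those log-probabilities.
    """
    return -sum(sample_logprobs) / len(sample_logprobs)


def implicit_reward(logp_policy, logp_ref, beta=0.1):
    """DPO implicit reward: beta * (log pi_theta(y|x) - log pi_ref(y|x))."""
    return beta * (logp_policy - logp_ref)


def preference_probability(reward_a, reward_b):
    """Bradley-Terry probability that completion a is preferred over b."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def preference_certainty(reward_a, reward_b):
    """Assumed certainty measure: absolute gap between implicit rewards.

    A large gap means the implicit preference model is confident about
    which completion is better; a gap near zero means it is unsure.
    """
    return abs(reward_a - reward_b)
```

An acquisition function would combine these scores to decide which prompt/completion pairs to send for labeling. How they are combined is not stated in this summary, so the sketch deliberately stops at computing the individual scores.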
Abstract
As large language models (LLMs) become more capable, fine-tuning techniques for aligning with human intent are increasingly important. A key consideration for aligning these models is how to most effectively use