BriefGPT.xyz
Feb, 2024
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition
Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang...
TL;DR
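By decomposing an evaluation task into finer-grained criteria, then aggregating and pruning those criteria according to human preferences, the HD-Eval framework improves the alignment of LLM evaluators with human preferences and captures aspects of natural language comprehensively at multiple levels.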
Abstract
Large language models (LLMs) have emerged as a promising alternative to expensive human evaluations. However, the alignment and coverage o