A Better LLM Evaluator for Text Generation: The Impact of Prompt Output Sequencing and Optimization

Jun, 2024

KuanChao Chu, Yi-Pei Chen, Hideki Nakayama

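TL;DR: This study examines prompt design for evaluating generated text with large language models. It finds that the prompt structure, and the order in which explanatory reasons are included, has a significant effect on the scores the model assigns, and it goes on to propose methods for optimizing scoring consistency.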
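To make the output-ordering idea concrete, here is a minimal sketch of two evaluation prompts that differ only in whether the score or the reason is requested first, plus a helper that summarizes repeated scores. Everything in it is an illustrative assumption rather than the authors' setup: the names `ORDERS`, `build_prompt`, and `summarize_scores`, the prompt wording, the 1-10 scale, and the JSON output format are all hypothetical. No model is called.

```python
from statistics import mean, pstdev

# Hypothetical output orderings: which field the model is asked to emit first.
ORDERS = {
    "score_first": ("score", "reason"),
    "reason_first": ("reason", "score"),
}


def build_prompt(text: str, order: str) -> str:
    """Build an evaluation prompt whose requested output fields follow `order`."""
    fields = ORDERS[order]
    field_list = ", then ".join(f'"{name}"' for name in fields)
    return (
        # The 1-10 scale and JSON format are assumptions for illustration.
        "Rate the following generated text on a 1-10 scale.\n"
        f"Return a JSON object with the keys {field_list}, in that order.\n\n"
        f"Text:\n{text}"
    )


def summarize_scores(scores: list[float]) -> dict[str, float]:
    """Mean and population standard deviation of repeated scores for one prompt."""
    return {"mean": mean(scores), "std": pstdev(scores)}
```

Sending both prompt variants to the same model several times and comparing the `summarize_scores` output for each would give a rough view of how much the ordering shifts the average score and its spread.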

Abstract

This research investigates prompt designs for evaluating generated texts using large language models (LLMs). While LLMs are increasingly us