Large language models (LLMs) have demonstrated impressive prowess in natural language generation in recent years. A common practice for improving generation diversity is to sample multiple outputs from the model. However, there is no simple and robust way of selecting the best output from these samples.