Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks. However, selecting the best in-context examples is challenging because model performance can vary widely depending on the selected examples. We present a cross-entropy difference (CED) method for selecting in-context demonstrations. Our method is based on the observation that the effectiveness of an in-context demonstration negatively correlates with the perplexity of the test example under a language model that was finetuned on that demonstration. We use parameter-efficient finetuning to train small models on the training data; these models are used to compute the cross-entropy difference between a test example and every candidate in-context demonstration. This metric is used to rank and select in-context demonstrations independently for each test input. We evaluate our method on a mixed-domain dataset that combines 8 benchmarks, representing 4 text generation tasks, and show that CED for in-context demonstration selection can improve performance for a variety of LLMs.
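The ranking step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the finetuned small models are replaced by a toy add-alpha unigram model so that the example runs without any dependencies, and the names `unigram_model`, `cross_entropy`, and `rank_by_ced` are hypothetical. The sketch also assumes that CED is the cross-entropy of the test input under the demonstration-adapted model minus its cross-entropy under the base model, with the lowest value selected. Because the base term is constant for a given test input, it does not change the ranking.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Toy stand-in for a (finetuned) LM: add-alpha smoothed unigram probabilities."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def cross_entropy(model, text):
    """Average negative log-probability per token of `text` under `model`."""
    tokens = text.lower().split()
    return -sum(math.log(model[t]) for t in tokens) / len(tokens)


def rank_by_ced(test_input, candidates, base_text):
    """Return (index of the lowest-CED candidate, list of CED scores).

    Assumption: "finetuning on a demonstration" is simulated by adding the
    demonstration's tokens to the base corpus before estimating the model.
    """
    all_text = " ".join([test_input, base_text] + candidates)
    vocab = set(all_text.lower().split())

    base_ce = cross_entropy(unigram_model(base_text, vocab), test_input)

    scores = []
    for demo in candidates:
        tuned = unigram_model(base_text + " " + demo, vocab)
        # CED: adapted-model cross-entropy minus base-model cross-entropy.
        scores.append(cross_entropy(tuned, test_input) - base_ce)

    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Illustrative data only; not taken from the benchmarks in this work.
candidates = [
    "translate the sentence into french",
    "what is the capital of france",
]
best, scores = rank_by_ced(
    test_input="what is the capital of spain",
    candidates=candidates,
    base_text="the model reads text and answers",
)
print(candidates[best])
```

In this toy setting the second candidate is selected, because adapting to it lowers the cross-entropy of the test input the most. With real models, `unigram_model` would be replaced by a parameter-efficient finetuning step and `cross_entropy` by the token-level loss of the adapted model on the test input.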

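This work proposes a method based on cross-entropy difference for selecting in-context demonstrations to improve language model performance. The method rests on the observation that the perplexity of a test example under a language model finetuned on a given demonstration is negatively correlated with the effectiveness of that demonstration. The authors evaluate the method on a mixed-domain dataset and show that it can improve the performance of a variety of large language models.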

In-Context Demonstration Selection Based on Cross-Entropy Difference