This paper reports on the use of prompt engineering and GPT-3.5 for biomedical query-focused multi-document summarisation. Using GPT-3.5 and appropriate prompts, our system achieves top ROUGE-F1 results in the task of obtaining short-paragraph-sized answers to biomedical questions in the 2023 BioASQ Challenge (BioASQ 11b). This paper confirms what has been observed in other domains: 1) prompts that incorporated few-shot samples generally improved on their zero-shot counterparts; 2) the largest improvement was achieved by retrieval-augmented generation. The fact that these prompts allow our top runs to rank within the top two runs of BioASQ 11b demonstrates the power of using adequate prompts for Large Language Models in general, and GPT-3.5 in particular, for query-focused summarisation.
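To make the three prompt families named above concrete, the following is a minimal sketch of how zero-shot, few-shot, and retrieval-augmented prompts might be assembled for a question. The function name `build_prompt`, its parameters, and all instruction wording are hypothetical and chosen for illustration; they are not the prompts used in the reported runs.

```python
from typing import List, Optional, Tuple


def build_prompt(
    question: str,
    examples: Optional[List[Tuple[str, str]]] = None,
    snippets: Optional[List[str]] = None,
) -> str:
    """Assemble a prompt for query-focused summarisation.

    All wording is hypothetical (an assumption for illustration),
    not the prompt text used in the paper.
    """
    parts = ["Answer the biomedical question in one short paragraph."]

    # Few-shot: prepend worked question/answer pairs, if any are given.
    for example_question, example_answer in examples or []:
        parts.append(f"Question: {example_question}\nAnswer: {example_answer}")

    # Retrieval augmentation: include retrieved text for the target question.
    if snippets:
        parts.append("Context:")
        parts.extend(f"- {snippet}" for snippet in snippets)

    # The target question always comes last, with the answer left open.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)
```

Calling `build_prompt(question)` alone gives the zero-shot variant; passing `examples` gives a few-shot variant, and passing `snippets` gives a retrieval-augmented one. The resulting string would then be sent to the language model, a step this sketch leaves out.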


The Application of Large Language Models and Query Engineering in Biomedical Multi-Document Summarisation