The entry of large language models (LLMs) into research and commercial spaces has led to a trend of ever-larger models, with initial promises of generalisability, followed by a widespread desire to downsize and create specialised models without the need for complete fine-tuning, using Parameter Efficient Fine-tuning (PEFT) methods. We present an investigation into the suitability of different PEFT methods to clinical decision-making tasks, across a range of model sizes, including extremely small models with as few as $25$ million parameters. Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another, with the exception of LoRA, which maintains relatively high performance across all model sizes and tasks, typically approaching or matching full fine-tuned performance. The effectiveness of PEFT methods in the clinical domain is evident, particularly for specialised models which can operate on low-cost, in-house computing infrastructure. The advantages of these models, in terms of speed and reduced training costs, dramatically outweighs any performance gain from large foundation LLMs. Furthermore, we highlight how domain-specific pre-training interacts with PEFT methods and model size, and discuss how these factors interplay to provide the best efficiency-performance trade-off. Full code available at: tbd.

对不同规模的模型以及临床决策任务的适用性进行研究，揭示大型语言模型的效果与Parameter Efficient Fine-tuning方法的关系，发现LoRA方法在各项任务和模型规模下都能保持较高的性能，专用模型在速度和训练成本上具有优越性，与大型基础语言模型相比效果更好，同时探讨了领域特定预训练与PEFT方法和模型规模之间的相互影响，以及提供最佳效率与性能平衡的因素。

大规模效率：探究微型语言模型在临床任务中的性能