This work explores a novel data augmentation method based on Large Language Models (LLMs) for predicting the item difficulty and response time of retired USMLE Multiple-Choice Questions (MCQs) in the BEA 2024 Shared Task. Our approach augments the dataset with answers from zero-shot LLMs (Falcon, Meditron, Mistral) and employs transformer-based models trained on six alternative feature combinations. The results suggest that predicting question difficulty is the more challenging of the two tasks. Notably, our top-performing methods consistently include the question text and benefit from the variability of LLM answers, highlighting the potential of LLMs for improving automated assessment in medical licensing exams. Our code is available at https://github.com/ana-rogoz/BEA-2024.
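
To make the idea of a feature combination concrete, the following is a minimal sketch of how a question and zero-shot LLM answers might be joined into one input string for a transformer-based regressor. It is an illustration under stated assumptions, not the authors' released code: the function name `build_input`, the `[SEP]` separator, the model keys, and the example item are all hypothetical, and the six combinations used in the paper are not reproduced here.

```python
from typing import Dict, Iterable

# Assumed separator; a real pipeline would use whatever the chosen tokenizer expects.
SEP = " [SEP] "


def build_input(
    question: str,
    llm_answers: Dict[str, str],
    use_question: bool = True,
    models: Iterable[str] = (),
) -> str:
    """Join the selected features (question text and/or LLM answers) into one string."""
    parts = []
    if use_question:
        parts.append(question.strip())
    for name in models:
        # Prefix each answer with its model name so the source stays distinguishable.
        parts.append(f"{name}: {llm_answers[name].strip()}")
    return SEP.join(parts)


# Hypothetical item; the answers stand in for zero-shot LLM outputs.
question = "Which vitamin deficiency causes scurvy?"
answers = {"falcon": "Vitamin C", "meditron": "Vitamin C", "mistral": "Vitamin D"}

question_only = build_input(question, answers)
question_plus_all = build_input(question, answers, models=["falcon", "meditron", "mistral"])
answers_only = build_input(question, answers, use_question=False, models=["mistral"])
```

Each resulting string would then be tokenized and passed to a regression model that outputs either difficulty or response time. Disagreement among the answers (here, `mistral` differs from the other two) is the kind of variability the abstract reports as helpful.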

UnibucLLM: Leveraging Language Models to Automatically Predict the Difficulty and Response Time of Multiple-Choice Questions