The integration of Large Language Models (LLMs) into the healthcare domain has the potential to significantly enhance patient care and support through the development of empathetic, patient-facing chatbots. This study investigates an intriguing question: can ChatGPT respond with a greater degree of empathy than physicians typically do? To answer this question, we collect a de-identified dataset of patient messages and physician responses from Mayo Clinic and generate alternative replies using ChatGPT. Our analyses incorporate a novel empathy ranking evaluation (EMRank) that involves both automated metrics and human assessments to gauge the empathy level of responses. Our findings indicate that LLM-powered chatbots have the potential to surpass human physicians in delivering empathetic communication, suggesting a promising avenue for enhancing patient care and reducing professional burnout. The study not only highlights the importance of empathy in patient interactions but also proposes a set of effective automatic empathy ranking metrics, paving the way for the broader adoption of LLMs in healthcare.

Assessing Empathy in Large Language Models with Real-World Physician-Patient Interactions