"An idea is nothing more nor less than a new combination of old elements" (Young, J.W.). The widespread adoption of Large Language Models (LLMs) and the public availability of ChatGPT have marked a significant turning point in the integration of Artificial Intelligence (AI) into people's everyday lives. This study explores the capability of LLMs to generate novel research ideas based on information from research papers. We conduct a thorough examination of four LLMs in five domains (Chemistry, Computer, Economics, Medical, and Physics). We found that the future research ideas generated by Claude-2 and GPT-4 are more aligned with the authors' perspective than those generated by GPT-3.5 and Gemini. We also found that Claude-2 generates more diverse future research ideas than GPT-4, GPT-3.5, and Gemini 1.0. We further performed a human evaluation of the novelty, relevance, and feasibility of the generated future research ideas. This investigation offers insights into the evolving role of LLMs in idea generation, highlighting both their capabilities and their limitations. Our work contributes to the ongoing efforts in evaluating and utilizing language models for generating future research ideas. We make our datasets and code publicly available.

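This study examines the ability of Large Language Models (LLMs) to generate new research ideas from information in research papers, addressing a gap in the application of artificial intelligence to research idea generation. We find that Claude-2 generates more diverse future research ideas than the other models, and that ideas from Claude-2 and GPT-4 align more closely with the authors' perspective than those from GPT-3.5 and Gemini. The study highlights both the potential and the limitations of LLMs in idea generation and offers insights for future use of language models to generate research ideas.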

Can large language models inspire new scientific research ideas?