This paper compares the effectiveness of a specially compiled language model with that of a general-purpose model, OpenAI's GPT-3.5, in detecting Sustainable Development Goals (SDGs) in text data. It presents a critical review of Large Language Models (LLMs), addressing challenges related to bias and sensitivity, and underlines the need for specialized training to achieve precise, unbiased analysis. A case study using a dataset of company descriptions offers insight into the differences between GPT-3.5 and the specialized SDG detection model. While GPT-3.5 offers broader coverage, it may identify SDGs that have limited relevance to the companies' activities. In contrast, the specialized model focuses on the SDGs that are highly pertinent. The paper emphasizes thoughtful model selection that takes into account task requirements, cost, complexity, and transparency. Despite the versatility of LLMs, specialized models are suggested for tasks demanding precision and accuracy. The study concludes by encouraging further research into balancing the capabilities of LLMs against the need for domain-specific expertise and interpretability.

A Critical Review of Large Language Models: Sensitivity, Bias, and the Path Toward Specialized AI