Large Language Models (LLMs) are trained on massive amounts of data, enabling their application across diverse domains and tasks. Despite their remarkable performance, most LLMs are developed and evaluated primarily in English. A few multilingual LLMs have emerged recently, but their performance in low-resource languages, especially the most widely spoken languages of South Asia, remains underexplored. To address this gap, we evaluate GPT-4, Llama 2, and Gemini to compare their effectiveness in English with their effectiveness in low-resource South Asian languages (Bangla, Hindi, and Urdu). Specifically, we use zero-shot prompting with five different prompt settings to investigate how well the LLMs handle cross-lingual translated prompts. Our findings suggest that GPT-4 outperforms Llama 2 and Gemini in all five prompt settings and across all languages. Moreover, all three LLMs perform better with English prompts than with prompts in the low-resource languages. By examining LLMs in low-resource language contexts, this study highlights the improvements needed in both LLMs and language-specific resources to develop more general-purpose NLP applications.

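This study addresses the fact that large language models (LLMs) are evaluated mainly in English, and analyzes in depth how they perform in low-resource South Asian languages. Using zero-shot prompting and five different prompt settings, it finds that GPT-4 outperforms Llama 2 and Gemini across all languages, and that all three models perform better with English prompts than with prompts in the low-resource languages. These findings underscore the need to improve LLMs for low-resource languages in order to advance more general-purpose natural language processing applications.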

Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings