Large Language Models (LLMs) have made significant progress in incorporating Indic languages within multilingual models. However, it is crucial to quantitatively assess whether LLMs perform comparably on these languages and on globally dominant ones, such as English. Currently, there is a lack of benchmark datasets specifically designed to evaluate the regional knowledge of LLMs in various Indic languages. In this paper, we present L3Cube-IndicQuest, a gold-standard question-answering benchmark dataset designed to evaluate how well multilingual LLMs capture regional knowledge across various Indic languages. The dataset contains 200 question-answer pairs each for English and 19 Indic languages, covering five domains specific to the Indic region. We intend this dataset to serve as a benchmark, providing ground truth for evaluating the performance of LLMs in understanding and representing knowledge relevant to the Indian context. IndicQuest can be used for both reference-based evaluation and LLM-as-a-judge evaluation. The dataset is shared publicly at https://github.com/l3cube-pune/indic-nlp.

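This work addresses the lack of benchmark datasets for evaluating how well large language models (LLMs) capture knowledge of the Indic region. The proposed L3Cube-IndicQuest dataset contains 200 question-answer pairs per language across 19 Indic languages and aims to quantify how well multilingual LLMs understand and represent India-specific knowledge. The release of this dataset provides a standard reference for research in this area and supports the further development of LLMs.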

L3Cube-IndicQuest: A Question-Answering Benchmark Dataset for Evaluating the Knowledge of Large Language Models in the Indian Context