Large-scale deployment of large language models (LLMs) in various applications, such as chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure inclusivity. Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive in LLMs that goes beyond multilinguality and builds on findings from psychology and anthropology. In this paper, we survey efforts towards incorporating cultural awareness into text-based and multimodal LLMs. We start by defining cultural awareness in LLMs, taking the definitions of culture from anthropology and psychology as a point of departure. We then examine methodologies adopted for creating cross-cultural datasets, strategies for cultural inclusion in downstream tasks, and methodologies that have been used for benchmarking cultural awareness in LLMs. Further, we discuss the ethical implications of cultural alignment, the role of Human-Computer Interaction in driving cultural inclusion in LLMs, and the role of cultural alignment in driving social science research. We finally provide pointers to future research based on our findings about gaps in the literature.

本研究针对大规模语言模型在应用中对用户文化敏感性的需求进行了调查，指出了当前文献中的不足之处。通过提出跨文化数据集的创建方法和文化纳入策略，本论文总结了在文本和多模态语言模型中增强文化意识的最新进展。研究发现，推动文化对齐不仅能提高语言模型的包容性，还有助于社会科学研究的发展。

语言模型中的文化意识调查：文本及其他