As large language models (LLMs) see growing use worldwide, it is crucial that they have adequate knowledge of, and fair representation for, diverse global cultures. In this work, we uncover the culture perceptions of three SOTA models across 110 countries and regions and 8 culture-related topics through culture-conditioned generations, and we extract from these generations the symbols that the LLM associates with each culture. We discover that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures. We also discover that LLMs show an uneven degree of diversity in their culture symbols, and that cultures from different geographic regions are present to different degrees in LLMs' culture-agnostic generations. Our findings promote further research on the knowledge and fairness of global culture perception in LLMs. Code and data are available at: https://github.com/huihanlhh/Culture-Gen/

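Through culture-conditioned generations on 8 culture-related topics for 110 countries and regions, and by extracting from these generations the symbols associated with each culture, we find that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures. We also find that LLMs are uneven in the diversity of their culture symbols, and that cultures from different geographic regions are present to different degrees in LLMs' culture-agnostic generations. Our findings encourage further research on the knowledge and fairness of global culture perception in LLMs.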

CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting