The glyphic writing system of Chinese packs information-rich visual features into each character, such as radicals that hint at meaning or pronunciation. However, no prior work has investigated whether contemporary Large Language Models (LLMs) and Vision-Language Models (VLMs) can harness these sub-character features in Chinese through prompting. In this study, we establish a benchmark to evaluate LLMs' and VLMs' understanding of the visual elements of Chinese characters, including radicals, composition structures, strokes, and stroke counts. Our results reveal that models, surprisingly, exhibit some but still limited knowledge of this visual information, regardless of whether images of the characters are provided. To elicit models' ability to use radicals, we further experiment with incorporating radicals into the prompts for Chinese language understanding tasks. We observe consistent improvement in Part-Of-Speech tagging when additional information about radicals is provided, suggesting the potential to enhance Chinese language processing (CLP) by integrating sub-character information.

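This study examines whether contemporary large language models (LLMs) and vision-language models (VLMs) can recognize and use visual information in Chinese characters, such as radicals, filling a research gap in this area. By establishing a benchmark, we find that models have some knowledge of the visual elements of Chinese characters, but that this knowledge remains limited. We also find that incorporating radical information into prompts notably improves model performance on part-of-speech tagging, which shows the potential of integrating sub-character information.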

The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals