Large language models (LLMs) are revolutionizing various fields by leveraging large text corpora for context-aware intelligence. Because of limits on context size, however, encoding an entire graph for an LLM is fundamentally constrained. This paper explores how to better integrate graph data with LLMs and presents a novel approach that uses various encoding modalities (e.g., text, image, and motif) and approximates the global connectivity of a graph through different prompting methods to enhance LLMs' effectiveness in handling complex graph structures. The study also introduces GraphTMI, a new benchmark for evaluating LLMs in graph structure analysis, focusing on factors such as homophily, motif presence, and graph difficulty. Key findings reveal that the image modality, supported by advanced vision-language models such as GPT-4V, is more effective than text at managing token limits while retaining critical information. The research also examines how these factors influence the performance of each encoding modality. This study highlights the current limitations of LLMs in graph understanding and reasoning tasks and charts future directions.

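This study explores how to better integrate graph data with large language models (LLMs) and proposes a new approach that uses various encoding modalities (e.g., text, image, and motif) and different prompting methods to enhance LLMs' effectiveness in handling complex graph structures. It also introduces GraphTMI, a new benchmark for evaluating LLMs on graph structure analysis, focusing on factors such as homophily, motif presence, and graph difficulty. Key findings show that the image modality, supported by advanced vision-language models such as GPT-4V, is more efficient at retaining critical information under token limits. The study also examines how different factors affect the performance of each encoding modality. It highlights the current limitations of LLMs in graph understanding and reasoning tasks and charts future directions.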

Which Modality Should I Use - Text, Motif, or Image?: Understanding Graphs with Large Language Models