Recent advances in large vision-language models (LVLMs) have revealed an \textit{overthinking} phenomenon, in which models generate verbose reasoning on every task regardless of question characteristics. To address this issue, we present \textbf{FAST}, a novel \textbf{Fa}st-\textbf{S}low \textbf{T}hinking framework that dynamically adapts reasoning depth to the characteristics of each question. Through empirical analysis, we establish the feasibility of fast-slow thinking in LVLMs by investigating how response length and data distribution affect performance. We then develop FAST-GRPO, which has three components: model-based metrics for question characterization, an adaptive thinking reward mechanism, and difficulty-aware KL regularization. Experiments across seven reasoning benchmarks demonstrate that FAST achieves state-of-the-art accuracy, with over 10\% relative improvement over the base model, while reducing token usage by 32.7--67.3\% compared to previous slow-thinking approaches, effectively balancing reasoning length and accuracy.

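This work addresses the \textit{overthinking} phenomenon in large vision-language models by proposing FAST, a fast-slow thinking framework that dynamically adjusts reasoning depth according to question characteristics. Experimental results show that FAST achieves state-of-the-art accuracy on seven reasoning benchmarks, with over 10\% relative improvement over the base model, while reducing token usage by 32.7\% to 67.3\%.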

Fast-Slow Thinking for Large Vision-Language Model Reasoning