The paper presents a new method for approximating Strong Stackelberg Equilibrium in general-sum sequential games with imperfect information and perfect recall. The proposed approach is generic, as it does not rely on any specific properties of a particular game model. The method iteratively interleaves the following two phases: (1) guided Monte Carlo Tree Search sampling of the Follower's strategy space and (2) building the Leader's behavior strategy tree for which the sampled Follower's strategy is an optimal response. This solution scheme is evaluated with respect to the Leader's expected utility and time requirements on three sets of interception games with variable characteristics, played on graphs. A comparison with three state-of-the-art MILP/LP-based methods shows that, in the vast majority of test cases, the proposed simulation-based approach yields optimal Leader's strategies, while outperforming the competing methods in time scalability and memory requirements.

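This paper proposes a new method for approximating Strong Stackelberg Equilibrium. The method is based on guided Monte Carlo Tree Search over the Follower's strategy space and on construction of the Leader's behavior strategy tree. In tests on games with three different topological structures, it achieved excellent results and proved more practical and more time-scalable than traditional methods.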

Double-oracle sampling method for Stackelberg Equilibrium approximation in general-sum extensive-form games