We consider a variant of continuous-state partially-observable stochastic games with neural perception mechanisms and an asymmetric information structure. One agent has partial information, with the observation function implemented as a neural network, while the other agent is assumed to have full knowledge of the state. We present, for the first time, an efficient online method to compute an $\varepsilon$-minimax strategy profile, which requires only one linear program to be solved for each agent at every stage, instead of a complex estimation of opponent counterfactual values. For the partially-informed agent, we propose a continual resolving approach which uses lower bounds, pre-computed offline with heuristic search value iteration (HSVI), instead of opponent counterfactual values. This inherits the soundness of continual resolving at the cost of pre-computing the bound. For the fully-informed agent, we propose an inferred-belief strategy, where the agent maintains an inferred belief about the belief of the partially-informed agent based on (offline) upper bounds from HSVI, guaranteeing $\varepsilon$-distance to the value of the game at the initial belief known to both agents.
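
As a point of reference only, the kind of per-stage linear program mentioned above can be illustrated with the textbook maximin LP for a finite zero-sum matrix game. This is a generic sketch, not the stage LP of the method described here, which would additionally involve the HSVI bounds and the agents' beliefs. All symbols are assumptions introduced for illustration: $M \in \mathbb{R}^{m \times n}$ is a payoff matrix for the maximizing agent, $x$ is that agent's mixed strategy over its $m$ actions, and $v$ is the payoff it can guarantee.

$$
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i=1}^{m} x_i M_{ij} \ge v, \quad j = 1,\dots,n, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0, \quad i = 1,\dots,m.
\end{aligned}
$$

At the optimum, $v$ equals the minimax value of the matrix game and $x$ is a maximin strategy. The minimizing agent's strategy can be read off the dual program.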

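We propose a variant of continuous-state partially-observable stochastic games with neural perception mechanisms and an asymmetric information structure. We present, for the first time, an efficient online method to compute an $\varepsilon$-minimax strategy profile, which requires only one linear program to be solved at each stage, instead of a complex estimation of opponent counterfactual values. For the partially-informed agent, we propose a continual resolving approach that uses lower bounds, pre-computed with heuristic search value iteration (HSVI), in place of opponent counterfactual values. For the fully-informed agent, we propose an inferred-belief strategy, in which the agent maintains an inferred belief about the belief of the partially-informed agent based on (offline) upper bounds from HSVI, guaranteeing $\varepsilon$-distance to the value of the game at the initial belief known to both agents.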

HSVI-based Online Minimax Strategies for Partially Observable Stochastic Games with Neural Perception Mechanisms