In many real-world decision-making problems, reaching an optimal decision requires taking into account a variable number of objects around the agent. Autonomous driving is a domain in which this is especially relevant, since the number of cars surrounding the agent varies considerably over time and affects the optimal action to be taken. Classical methods that process object lists can deal with this requirement. However, to take advantage of recent high-performing methods based on deep reinforcement learning in modular pipelines, special architectures are necessary. For these, a number of options exist, but a thorough comparison of the different possibilities is missing. In this paper, we elaborate on the limitations of fully-connected neural networks and other established approaches, such as convolutional and recurrent neural networks, in the context of reinforcement learning problems that have to deal with variable-sized inputs. We employ the structure of Deep Sets in off-policy reinforcement learning for high-level decision making, highlight their ability to alleviate these limitations, and show that Deep Sets not only yield the best overall performance but also offer better generalization to unseen situations than the other approaches.
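To illustrate how a Deep Sets structure can consume a variable number of surrounding objects, the following is a minimal NumPy sketch, not the implementation evaluated in the paper. It assumes the common Deep Sets layout: a shared per-object encoder (here called `phi`), sum pooling, and a second network (`rho`) that also receives the ego features. All names, layer sizes, feature dimensions, and the random untrained weights are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Random weights and zero biases for one dense layer (untrained, illustration only)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def dense_relu(x, layer):
    """Apply one dense layer followed by a ReLU."""
    w, b = layer
    return np.maximum(x @ w + b, 0.0)

# Hypothetical sizes: 3 features per surrounding car, 2 ego features, 3 actions.
OBJ_DIM, EGO_DIM, HIDDEN, N_ACTIONS = 3, 2, 16, 3
phi = init_layer(OBJ_DIM, HIDDEN)            # shared per-object encoder
rho = init_layer(HIDDEN + EGO_DIM, HIDDEN)   # processes the pooled set plus ego features
q_head = init_layer(HIDDEN, N_ACTIONS)       # linear output: one Q-value per action

def q_values(objects, ego):
    """Return one Q-value per action for a variable-length list of objects."""
    objects = np.asarray(objects, dtype=float).reshape(-1, OBJ_DIM)
    # Encode every object with the same weights, then sum over the object axis.
    # The pooled vector has a fixed size regardless of how many objects there are.
    pooled = dense_relu(objects, phi).sum(axis=0)
    hidden = dense_relu(np.concatenate([pooled, ego]), rho)
    w, b = q_head
    return hidden @ w + b

ego = np.array([0.5, 0.0])
cars = np.array([[10.0, 1.0, 0.0],
                 [-5.0, -1.0, 1.0],
                 [30.0, 0.5, -1.0]])

q_three = q_values(cars, ego)           # three surrounding cars
q_shuffled = q_values(cars[::-1], ego)  # same cars, different order
q_one = q_values(cars[:1], ego)         # only one surrounding car
```

Because the objects are combined by a sum, the output has the same shape for one car or three, and reordering the cars leaves the Q-values unchanged up to floating-point rounding. A fully-connected network on a concatenated object list offers neither property without padding and a fixed ordering.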

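This paper describes the limitations of fully-connected, convolutional, and recurrent neural networks in reinforcement learning problems with variable-sized inputs. It proposes using the Deep Sets structure in off-policy reinforcement learning for high-level decision making and, by comparing the different options, shows that Deep Sets not only achieve the best overall performance but also generalize better to unseen situations.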

Dynamic Input for Deep Reinforcement Learning in Autonomous Driving