An open problem in autonomous vehicle safety validation is building reliable
models of human driving behavior in simulation. This work presents an approach
for learning neural driving policies from real-world driving demonstration
data. We model human driving as a sequential decision-making problem
characterized by nonlinearity, stochasticity, and unknown underlying cost
functions. Imitation learning is an approach for generating intelligent
behavior when the cost function is unknown or difficult to specify. Building
upon work in inverse reinforcement learning (IRL), Generative Adversarial
Imitation Learning (GAIL) aims to provide effective imitation even for problems
with large or continuous state and action spaces, such as modeling human
driving. This article describes the use of GAIL for learning-based driver
modeling. Because driver modeling is inherently a multi-agent problem in which
the interactions between agents must be modeled, we describe a
parameter-sharing extension of GAIL, called PS-GAIL, for multi-agent driver
modeling. In addition, GAIL is domain agnostic, which makes it difficult to
encode driving-specific knowledge in the learning process. We describe Reward
Augmented Imitation Learning (RAIL), which modifies the reward signal to
provide domain-specific knowledge to the agent. Finally, human demonstrations
depend on latent factors that GAIL may not capture. We describe Burn-InfoGAIL,
which disentangles the latent variability in demonstrations. Imitation
learning experiments are
performed using NGSIM, a real-world highway driving dataset. Experiments show
that these modifications to GAIL can successfully model highway driving
behavior, accurately replicating human demonstrations and generating realistic,
emergent behavior in the traffic flow arising from the interaction between
driving agents.
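
As a rough illustration of the reward modification that RAIL refers to, the
sketch below subtracts a penalty for undesirable driving events from a
GAIL-style surrogate reward. It is a minimal sketch under stated assumptions,
not the implementation used in this work: the reward form -log(1 - D), the
event names, the penalty magnitudes, and the function names are all
hypothetical and chosen only for illustration.

```python
import math

# Hypothetical penalty magnitudes for undesirable events (assumed values,
# not taken from the article).
PENALTIES = {"collision": 2.0, "off_road": 2.0, "hard_brake": 1.0}


def surrogate_reward(d_expert: float, eps: float = 1e-8) -> float:
    """GAIL-style surrogate reward computed from a discriminator output.

    Assumes d_expert is the discriminator's estimated probability that a
    state-action pair came from the human demonstrations. The value is
    clipped away from 0 and 1 so the logarithm stays finite.
    """
    d = min(max(d_expert, eps), 1.0 - eps)
    return -math.log(1.0 - d)


def augmented_reward(d_expert: float, events: set) -> float:
    """Surrogate reward minus the penalties for the events that occurred."""
    penalty = sum(PENALTIES[name] for name in events)
    return surrogate_reward(d_expert) - penalty
```

In this sketch, a step with no undesirable events receives the plain surrogate
reward, while a call such as `augmented_reward(0.5, {"collision"})` returns a
lower value, which is how domain knowledge about unsafe driving would reach
the policy update.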

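This article proposes a method for learning neural driving policies from
real-world driving demonstration data, using Generative Adversarial Imitation
Learning (GAIL) to generate intelligent driving behavior. It also introduces a
multi-agent model that addresses the problems arising in multi-agent driver
modeling, and describes Reward Augmented Imitation Learning (RAIL), which
modifies the reward signal, and Burn-InfoGAIL, which disentangles latent
factors of variation. On the NGSIM dataset, the approach successfully models
highway driving behavior.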