April 2019

Reinforced Imitation in Heterogeneous Action Space

Konrad Zolna, Negar Rostamzadeh, Yoshua Bengio, Sungjin Ahn, Pedro O. Pinheiro

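TL;DR: This paper proposes a method that gradually balances the imitation learning cost against the reinforcement learning objective, so that a robot can use a sparse reward function to optimize its actions and perform better in settings such as navigation.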
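
One way to picture the gradual balancing of the two objectives is a weight that shifts from the imitation cost to the reinforcement learning objective over training. The sketch below is only an illustration under stated assumptions: the summary does not say which schedule the authors use, so the linear schedule, the function names `imitation_weight` and `combined_loss`, and all parameters are hypothetical.

```python
def imitation_weight(step, total_steps, start=1.0, end=0.0):
    """Linearly anneal the imitation weight from `start` to `end`.

    Assumption: a linear schedule; the source does not specify one.
    """
    # Clamp training progress to [0, 1] so steps past the horizon keep `end`.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def combined_loss(imitation_loss, rl_loss, step, total_steps):
    """Blend the two losses: mostly imitation early, mostly RL late."""
    w = imitation_weight(step, total_steps)
    return w * imitation_loss + (1.0 - w) * rl_loss


if __name__ == "__main__":
    # Print the blended loss at a few points in a 100-step run.
    for step in (0, 25, 50, 100):
        print(step, combined_loss(2.0, 4.0, step, total_steps=100))
```

With this hypothetical schedule, the sparse environment reward matters little at the start, when it is rarely observed, and dominates once the imitation signal has brought the agent close to useful behavior.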

Abstract

Imitation learning is an effective alternative approach to learn a policy when the reward function is sparse. In this paper, we consider a challenging setting where an agent and an expert use different actions from each other. We assume that the agent has access to a →