BriefGPT.xyz
Mar, 2019
Hybrid Reinforcement Learning with Expert State Sequences
Xiaoxiao Guo, Shiyu Chang, Mo Yu, Gerald Tesauro, Murray Campbell
TL;DR
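This paper proposes a tensor-based model for inferring the unobserved actions in expert state sequences, and optimizes the agent's policy with a hybrid of reinforcement learning and imitation learning. Empirical results show that the hybrid approach outperforms standard deep neural network models and is robust to perturbations in the expert state sequences.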
Abstract
Existing imitation learning approaches often require that the complete demonstration data, including sequences of actions and states, are available. In this paper, we consider a more realistic and difficult scenario where a