December 2018

Learning Montezuma's Revenge from a Single Demonstration

Tim Salimans, Richard Chen

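TL;DR: Proposes a new method that uses a single demonstration to learn to solve hard exploration tasks such as Montezuma's Revenge. The method trains the agent by maximizing reward, which shortens learning time and reduces task complexity.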

Abstract

We propose a new method for learning from a single demonstration to solve hard exploration tasks like the Atari game Montezuma's Revenge.