Demonstrations are widely used in Deep Reinforcement Learning (DRL) to facilitate solving tasks with sparse rewards. However, tasks in real-world scenarios often have initial conditions that differ from those in the demonstration, so solving them requires additional prior behaviours. For example, suppose we are given a demonstration for the task of \emph{picking up an object from an open drawer}, but the drawer is closed during training. Without acquiring the prior behaviour of opening the drawer, the robot is unlikely to solve the task. To address this, we propose Intrinsic Rewards Driven Example-based Control \textbf{(IRDEC)}. Our method endows agents with the ability to explore and acquire the required prior behaviours and then connect them to the task-specific behaviours in the demonstration, solving sparse-reward tasks without requiring additional demonstrations of the prior behaviours. Our method outperforms other baselines on three navigation tasks and one robotic manipulation task with sparse rewards. Code is available at https://github.com/Ricky-Zhu/IRDEC.

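By applying deep reinforcement learning (DRL) with demonstrations, we propose Intrinsic Rewards Driven Example-based Control (IRDEC), a method that enables agents to explore and acquire the required prior behaviours and then connect them to the task-specific behaviours in the demonstration, solving sparse-reward tasks without additional demonstrations of the prior behaviours. Our method outperforms other baselines on three navigation tasks and one robotic manipulation task.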

Learning to solve tasks by leveraging prior behaviours