Reinforcement learning is an appropriate and successful method to robustly
perform low-level robot control under noisy conditions. Symbolic action
planning is useful to resolve causal dependencies and to break a causally
complex problem down into a sequence of simpler high-level actions. Integrating
the two approaches is difficult because action planning operates on discrete
high-level action and state spaces, whereas reinforcement learning is usually
driven by a continuous reward function. However, recent advances in
reinforcement learning, specifically, universal value function approximators
and hindsight experience replay, have focused on goal-independent methods based
on sparse rewards. In this article, we build on these novel methods to
facilitate the integration of action planning with reinforcement learning by
exploiting reward sparsity as a bridge between the high-level and low-level
state and control spaces. As a result, we demonstrate that the integrated
neuro-symbolic method is able to solve object manipulation problems that
involve tool use and non-trivial causal dependencies under noisy conditions,
exploiting both data and knowledge.

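This article presents a method that uses reward sparsity as a bridge to combine symbolic action planning with reinforcement learning. The combined method solves object manipulation problems that involve tool use and complex causal dependencies under noisy conditions, exploiting both data and knowledge.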
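
To make the bridging idea concrete, the following is a minimal sketch, not the implementation used in this article. It assumes a goal-conditioned setting in which a sparse reward is 0 when the achieved state lies within a distance threshold of the goal and -1 otherwise, so that the same test can double as the truth value of a discrete high-level predicate. The names `sparse_reward`, `predicate_holds`, `relabel_with_hindsight`, and the threshold `EPS` are hypothetical, and the relabeling uses the simple "final state" strategy from hindsight experience replay.

```python
import numpy as np

EPS = 0.05  # hypothetical distance threshold, an assumption of this sketch


def sparse_reward(achieved, goal, eps=EPS):
    """Return 0.0 if `achieved` is within `eps` of `goal`, otherwise -1.0."""
    distance = np.linalg.norm(np.asarray(achieved) - np.asarray(goal))
    return 0.0 if distance < eps else -1.0


def predicate_holds(achieved, subgoal, eps=EPS):
    """Read the sparse reward as a discrete predicate for a symbolic planner.

    A planner could use this to check whether a low-level subgoal
    (a continuous target) counts as satisfied at the symbolic level.
    """
    return sparse_reward(achieved, subgoal, eps) == 0.0


def relabel_with_hindsight(episode):
    """Relabel an episode with the finally achieved state as its goal.

    `episode` is a list of (observation, action, achieved_next) tuples.
    Returns (observation, action, new_goal, reward) tuples whose rewards
    are recomputed against the substituted goal.
    """
    new_goal = episode[-1][2]
    return [
        (obs, act, new_goal, sparse_reward(achieved, new_goal))
        for obs, act, achieved in episode
    ]
```

Under these assumptions, a failed episode still yields at least one transition with reward 0 after relabeling, which is what makes learning from sparse rewards feasible, while `predicate_holds` shows how the same threshold test could expose a continuous subgoal to a discrete planner.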