There has been an increasing interest in 3D indoor navigation, where a robot in an environment moves to a target according to an instruction. To deploy a robot for navigation in the physical world, lots of training data is required to learn an effective policy. It is quite labour intensive to obtain sufficient real environment data for training robots while synthetic data is much easier to construct by render-ing. Though it is promising to utilize the synthetic environments to facilitate navigation training in the real world, real environment are heterogeneous from synthetic environment in two aspects. First, the visual representation of the two environments have significant variances. Second, the houseplans of these two environments are quite different. There-fore two types of information,i.e. visual representation and policy behavior, need to be adapted in the reinforce mentmodel. The learning procedure of visual representation and that of policy behavior are presumably reciprocal. We pro-pose to jointly adapt visual representation and policy behavior to leverage the mutual impacts of environment and policy. Specifically, our method employs an adversarial feature adaptation model for visual representation transfer anda policy mimic strategy for policy behavior imitation. Experiment shows that our method outperforms the baseline by 19.47% without any additional human annotations.

本文介绍了一种基于对抗特征调整模型的3D室内导航机器人训练方法，通过视觉特征的转换与行为策略的模仿来提高机器人在真实环境中的表现。实验证明该方法能够在不需要额外人工注释的情况下，比基线方法表现提高19.47%。

Sim-Real联合强化迁移学习在3D室内导航中的应用