Sim-and-real training is a promising alternative to sim-to-real training for robot manipulation. However, current sim-and-real training is neither efficient (it converges slowly to the optimal policy) nor effective (it requires a sizeable amount of real-world robot data). Given limited time and hardware budgets, the performance of sim-and-real training is not satisfactory. In this paper, we propose a Consensus-based Sim-And-Real deep reinforcement learning algorithm (CSAR) for manipulator pick-and-place tasks, which achieves comparable performance in both the simulated and the real world. In this algorithm, we train agents in simulators and in the real world to obtain the optimal policies for both. We found two interesting phenomena: (1) the best policy in simulation is not the best for sim-and-real training; (2) the more simulation agents are used, the better sim-and-real training performs. The experimental video is available at: https://youtu.be/mcHJtNIsTEQ.

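This paper proposes a consensus-based sim-and-real training algorithm (CSAR), built on deep reinforcement learning, for robot pick-and-place tasks, with the goal of efficient and effective policy optimization in both simulated and real environments. Experiments show that the best policy in simulation is not necessarily suited to sim-and-real training. In addition, the more simulation agents are used, the better sim-and-real training performs.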

Sim-and-Real Reinforcement Learning for Robot Manipulation: A Consensus-Based Approach