Data augmentation creates new data points by transforming the original ones for a reinforcement learning (RL) agent to learn from, and it has been shown to improve the data efficiency of RL for continuous control. Prior work toward this objective has largely been restricted to perturbation-based data augmentation, where new data points are created by perturbing the original ones. This approach has been highly effective for tasks where the RL agent observes control states as images, with perturbations such as random cropping and shifting. This work focuses on state-based control, where the RL agent directly observes raw kinematic and task features, and considers an alternative data augmentation applied to these features, based on Euclidean symmetries under transformations such as rotations. We show that the default state features used in existing benchmark tasks, which are based on joint configurations, are not amenable to Euclidean transformations. We therefore advocate using state features based on configurations of the limbs (i.e., the rigid bodies connected by the joints), which instead provide rich augmented data under Euclidean transformations. With minimal hyperparameter tuning, we show that this new Euclidean data augmentation strategy significantly improves both the data efficiency and the asymptotic performance of RL on a wide range of continuous control tasks.
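
As a rough illustration of the idea, the sketch below rotates limb-based features of one transition about the vertical axis. It is a minimal example under stated assumptions and not the authors' implementation: the function names `rotation_about_z` and `augment_transition`, the choice of the z-axis as the symmetry axis, and the layout of limb positions and velocities as arrays of shape `(num_limbs, 3)` are all hypothetical. It also assumes that the task's reward and actions are unchanged by such a rotation, which has to be checked per task.

```python
import numpy as np


def rotation_about_z(theta):
    """Return the 3x3 matrix that rotates by angle theta about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def augment_transition(limb_pos, limb_vel, next_limb_pos, next_limb_vel, rng):
    """Apply one random rotation to all limb features of a transition.

    Each input is an array of shape (num_limbs, 3) in world coordinates
    (an assumed layout). The same rotation is applied to the current and
    next state so the augmented transition stays dynamically consistent.
    """
    theta = rng.uniform(0.0, 2.0 * np.pi)
    rot = rotation_about_z(theta)
    # Row vectors are rotated by right-multiplying with the transpose.
    return tuple(x @ rot.T for x in
                 (limb_pos, limb_vel, next_limb_pos, next_limb_vel))


# Usage: augment one sampled transition with hypothetical limb features.
rng = np.random.default_rng(0)
pos = rng.normal(size=(4, 3))
vel = rng.normal(size=(4, 3))
next_pos = pos + 0.01 * vel
aug_pos, aug_vel, aug_next_pos, aug_next_vel = augment_transition(
    pos, vel, next_pos, vel, rng)
```

Distances between limbs, and hence joint angles, are identical before and after the rotation. This is one way to see why joint-based features yield no new data under such transformations, whereas limb positions and velocities in world coordinates do change.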

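This study addresses the data efficiency of reinforcement learning for continuous control, particularly in state-based control settings. The authors propose a novel data augmentation method based on Euclidean symmetries, which transforms limb-configuration features and significantly improves both data efficiency and final performance. The study shows that the new method performs well across a variety of continuous control tasks and has considerable potential for application.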

Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control