May, 2023
Unsupervised Discovery of Continuous Skills on a Sphere
Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka
TL;DR
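This paper proposes a learning method called DISCS, which learns a potentially infinite number of distinct skills by maximizing the mutual information between skills and states, with each skill corresponding to a continuous value on a sphere. Experiments in the MuJoCo Ant robot control environment show that DISCS learns more diverse skills than other methods.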
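To make the idea of a skill as a continuous value on a sphere concrete, here is a minimal sketch of drawing such a skill vector. It is not the authors' implementation: the summary does not say how skills are sampled, so the uniform distribution, the dimension, and the helper names `sample_skill_on_sphere` and `cosine_similarity` are all assumptions made for illustration.

```python
import math
import random


def sample_skill_on_sphere(dim, rng=random):
    """Draw one point uniformly from the unit sphere in R^dim.

    Assumption: uniform sampling via a normalized Gaussian vector.
    The source only states that each skill is a continuous value on a sphere.
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # retry in the (extremely unlikely) near-zero case
            return [x / norm for x in v]


def cosine_similarity(a, b):
    """Dot product of two unit vectors, i.e. the cosine of the angle between them."""
    return sum(x * y for x, y in zip(a, b))
```

A policy conditioned on such a vector could in principle be given any point on the sphere, which is what makes the set of skills continuous rather than a fixed discrete list; nearby vectors (cosine similarity close to 1) would be expected to index similar behaviors.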
Abstract
Recently, methods for learning diverse skills to generate various behaviors without external rewards have been actively studied as a form of unsupervised reinforcement learning. However, most of the existing meth