BriefGPT.xyz
May, 2023
Behavior Contrastive Learning for Unsupervised Skill Discovery
Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao...
TL;DR
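This paper proposes a contrastive-learning-based method for unsupervised skill discovery. It makes behaviors produced under the same skill similar to one another and behaviors produced under different skills distinct, and it also increases state entropy to obtain better state coverage. Experiments show that the method produces a variety of far-reaching skills and achieves competitive performance on downstream tasks.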
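The idea of pulling together behaviors from the same skill and pushing apart behaviors from different skills can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is not the paper's exact objective: the function names, the cosine-similarity choice, the temperature value, and the 2-D embeddings are all assumptions made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss (assumed form, not taken from the paper).

    The loss is small when the anchor is more similar to the positive
    than to any of the negatives.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical state embeddings: two from skill A, one from skill B.
skill_a_state_1 = [1.0, 0.1]
skill_a_state_2 = [0.9, 0.2]
skill_b_state = [-1.0, 0.3]

# Positive pair drawn from the same skill: behaviors already agree.
aligned = info_nce(skill_a_state_1, skill_a_state_2, [skill_b_state])

# Positive pair drawn from different skills: behaviors disagree.
misaligned = info_nce(skill_a_state_1, skill_b_state, [skill_a_state_2])
```

In this toy setup `aligned` is smaller than `misaligned`, so minimizing the loss favors embeddings where same-skill behaviors cluster together. The state-entropy term mentioned above is not modeled here.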
Abstract
In reinforcement learning, unsupervised skill discovery aims to learn diverse skills without extrinsic rewards. Previous methods discover skills by maximizing the mutual information (MI) between states and skills