BriefGPT.xyz
Mar, 2021
Mutual Information State Intrinsic Control
Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu
TL;DR
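This paper proposes an intrinsically motivated reinforcement learning method whose reward function is defined as the mutual information between the agent state and the surrounding state. It outperforms previous methods, including completing the pick-and-place task for the first time without any task reward.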
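To make the reward idea concrete, here is a minimal sketch of mutual information between an "agent state" and a "surrounding state". It is a toy illustration only: it assumes both states have been discretized into integer bins and uses a simple plug-in (count-based) estimator, which is not necessarily the estimator the paper uses. The names `mutual_information`, `agent_states`, `moved_object`, and `static_object` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical discretized trajectories (assumed data, not from the paper).
agent_states = [0, 1, 2, 3, 0, 1, 2, 3]
moved_object = [0, 1, 2, 3, 0, 1, 2, 3]   # object follows the agent
static_object = [0, 0, 0, 0, 0, 0, 0, 0]  # object never moves

mi_moved = mutual_information(agent_states, moved_object)
mi_static = mutual_information(agent_states, static_object)
```

In this toy setting the mutual information is high when the object's position tracks the agent's and zero when the object never moves, which is the intuition behind rewarding an agent for influencing its surroundings.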
Abstract
Reinforcement learning has been shown to be highly successful at many challenging tasks. However, success heavily relies on well-shaped rewards. Intrinsically motivated RL attempts to remove this constraint by de