BriefGPT.xyz
Jul, 2019
A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment
Felix Leibfried, Sergio Pascual-Diaz, Jordi Grau-Moya
TL;DR
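This paper studies empowerment, an intrinsic motivation method, in reinforcement learning with extrinsic reward signals. It proposes a unified Bellman optimality principle for empowered reward maximization, develops an empowerment-based actor-critic reinforcement learning algorithm, and reports that the algorithm outperforms existing techniques in high-dimensional continuous robotics domains.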
Abstract
Empowerment is an information-theoretic method that can be used to intrinsically motivate learning agents. It attempts to maximize an agent's control over the environment by encouraging visiting states with a lar…
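To make the information-theoretic idea concrete: empowerment is commonly formalized as the channel capacity between an agent's actions and the resulting next states, that is, the maximum over action distributions of the mutual information I(A; S' | s). The sketch below is only an illustration under strong assumptions: a small tabular transition model for a single state, solved with Blahut-Arimoto iteration. It is not the actor-critic algorithm the paper develops, and the names `empowerment` and `p_next` are hypothetical.

```python
import numpy as np


def empowerment(p_next, iters=200, eps=1e-12):
    """Estimate max_w I(A; S') in nats for one fixed state.

    p_next[a, s2] is an assumed tabular model p(s2 | s, a);
    every row must sum to 1.
    """
    p_next = np.asarray(p_next, dtype=float)
    n_actions = p_next.shape[0]
    # Start from a uniform distribution over actions.
    w = np.full(n_actions, 1.0 / n_actions)

    def kl_to_marginal(w):
        # Next-state marginal under the current action distribution.
        marginal = w @ p_next
        # KL divergence of each action's outcome distribution from it.
        return np.sum(
            p_next * (np.log(p_next + eps) - np.log(marginal + eps)), axis=1
        )

    for _ in range(iters):
        # Blahut-Arimoto update: upweight actions with distinctive outcomes.
        w = w * np.exp(kl_to_marginal(w))
        w /= w.sum()

    # Mutual information achieved by the final action distribution.
    return float(w @ kl_to_marginal(w))


if __name__ == "__main__":
    # Two actions that lead to two different states: one bit of control.
    print(empowerment([[1.0, 0.0], [0.0, 1.0]]))  # about 0.693 (log 2)
    # Two actions that lead to the same state: no control at all.
    print(empowerment([[1.0, 0.0], [1.0, 0.0]]))  # about 0.0
```

A state from which distinct actions reliably reach distinct successor states scores high, while a state where every action has the same outcome scores zero, which matches the notion of control described in the abstract.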