BriefGPT.xyz
Oct, 2020
Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith
TL;DR
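This paper shows how reward machines can support learning in reinforcement learning, and explores how exploiting reward machine structure can improve sample efficiency and the quality of the final policy.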
Abstract
Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most …