BriefGPT.xyz
May, 2024
Reward Machines for Deep RL in Noisy and Uncertain Environments
Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte...
TL;DR
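We propose a suite of RL algorithms for deep reinforcement learning in noisy and uncertain environments that exploit task structure. They improve performance when the vocabulary is noisy, and provide a general framework for exploiting Reward Machines in partially observable environments.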
Abstract
Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they e
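To make the "automata-inspired structure" concrete, here is a minimal sketch of a Reward Machine as a finite-state machine whose transitions fire on propositions from a vocabulary. It is an illustration only, not the paper's algorithms: the class `RewardMachine`, the helper `run`, the state names, and the coffee-delivery task are all hypothetical, and the sketch assumes the true propositions are observed perfectly at every step, which is exactly the assumption that noisy and uncertain environments break.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Hypothetical minimal Reward Machine: states plus labelled transitions."""

    initial_state: str
    # Maps (state, proposition) -> (next_state, reward).
    transitions: dict = field(default_factory=dict)

    def step(self, state, true_props):
        """Advance on the set of propositions that hold at this time step."""
        # Sorted so the choice is deterministic if several propositions match.
        for prop in sorted(true_props):
            if (state, prop) in self.transitions:
                return self.transitions[(state, prop)]
        # No matching transition: stay in the same state with zero reward.
        return state, 0.0


# Illustrative task (an assumption, not from the source):
# "get coffee, then deliver it to the office".
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
)


def run(machine, label_sequence):
    """Feed a sequence of proposition sets through the machine."""
    state, total = machine.initial_state, 0.0
    for props in label_sequence:
        state, reward = machine.step(state, props)
        total += reward
    return state, total


if __name__ == "__main__":
    # Reward is temporally extended: the office only pays off after coffee.
    print(run(rm, [set(), {"coffee"}, set(), {"office"}]))
    print(run(rm, [{"office"}, {"coffee"}]))
```

The reward here depends on the history of propositions rather than on the current observation alone, which is the kind of temporally extended behaviour the abstract refers to. If the proposition sets passed to `step` came from a noisy detector instead of ground truth, the machine state itself would become uncertain.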