BriefGPT.xyz
Feb, 2021
State Augmented Constrained Reinforcement Learning: Overcoming the Limitations of Learning with Rewards
Miguel Calvo-Fullana, Santiago Paternain, Luiz F. O. Chamon, Alejandro Ribeiro
TL;DR
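By augmenting the state with Lagrange multipliers and reinterpreting primal-dual methods as the part of the dynamics that drives the evolution of the multipliers, this paper proposes a systematic state augmentation procedure that is guaranteed to solve reinforcement learning problems with constraints.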
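As a rough illustration of the idea in the TL;DR, the sketch below shows a projected dual step on the multipliers and a state that carries those multipliers alongside the environment observation. It is a minimal sketch under assumptions, not the authors' algorithm: the function names `dual_update` and `augment_state`, the step size, and the convention that each constraint reads "average reward at least threshold" are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def dual_update(
    multipliers: Sequence[float],
    avg_rewards: Sequence[float],
    thresholds: Sequence[float],
    step_size: float,
) -> List[float]:
    """One projected dual step (assumed form, not taken from the paper).

    A multiplier grows when its reward falls short of the threshold,
    shrinks when the constraint has slack, and is clipped at zero.
    """
    return [
        max(0.0, lam - step_size * (reward - threshold))
        for lam, reward, threshold in zip(multipliers, avg_rewards, thresholds)
    ]


def augment_state(
    state: Sequence[float], multipliers: Sequence[float]
) -> Tuple[float, ...]:
    """Append the multipliers to the observation so a policy can condition on them."""
    return tuple(state) + tuple(multipliers)


# Hypothetical numbers: the first constraint is violated, the second has slack.
multipliers = dual_update([0.5, 0.0], [0.2, 1.0], [1.0, 0.5], step_size=0.1)
augmented = augment_state((1.0, 2.0), multipliers)
```

In this reading, the multipliers stop being quantities tuned outside the agent and become part of what the policy observes, with the dual step acting as their transition rule.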
Abstract
Constrained reinforcement learning involves multiple rewards that must individually accumulate to given thresholds. In this class of problems, we show a simple example in which the desired optimal policy cannot b