Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

February 2020

Alberto Maria Metelli, Flavio Mazzolini, Lorenzo Bisi, Luca Sabbioni, Marcello Restelli

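TL;DR: This paper introduces PFQI, a new algorithm based on action repetition that aims to improve the performance of reinforcement learning algorithms; the approach is validated both theoretically and experimentally.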
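
To make the idea of action repetition concrete, here is a minimal sketch in Python. It is not PFQI and is not taken from the paper: the helper `persistent_step`, the toy dynamics `toy_step`, and the parameters `k` (number of repetitions) and `gamma` (discount factor) are all hypothetical names chosen for illustration. The sketch only shows what it means to hold one action for several consecutive base steps, which lowers the effective control frequency.

```python
from typing import Callable, Tuple

# Hypothetical types for this illustration only.
State = float
Action = int
StepFn = Callable[[State, Action], Tuple[State, float, bool]]


def persistent_step(
    step: StepFn, state: State, action: Action, k: int, gamma: float
) -> Tuple[State, float, bool]:
    """Hold `action` for up to `k` base steps.

    Returns the final state, the discounted sum of the rewards collected
    while the action was held, and whether the episode terminated.
    """
    total = 0.0
    discount = 1.0
    done = False
    for _ in range(k):
        state, reward, done = step(state, action)
        total += discount * reward
        discount *= gamma
        if done:  # stop repeating once the episode ends
            break
    return state, total, done


def toy_step(state: State, action: Action) -> Tuple[State, float, bool]:
    """Assumed toy dynamics: move along a line, pay the distance from 0."""
    next_state = state + action
    reward = -abs(next_state)
    done = abs(next_state) >= 10
    return next_state, reward, done


if __name__ == "__main__":
    # Hold action +1 for three base steps with discount 0.5.
    print(persistent_step(toy_step, 0.0, 1, k=3, gamma=0.5))
```

With `k = 1` the helper reduces to an ordinary single step, so the persistence `k` acts as a knob on how often the agent is allowed to change its action.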

Abstract

The choice of the control frequency of a system has a relevant impact on the ability of reinforcement learning algorithms to learn a highly performing policy. In this paper, we introduce the notion of