Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning

January 2022

Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning

Yitzhak Spielberg, Amos Azaria

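TL;DR: The paper introduces a varying step-number algorithm driven by a criticality measure, where the criticality function is either provided by a human or learned from the environment. In the reported tests it outperforms popular learning algorithms such as Deep Q-Learning and Monte Carlo across several domains, including Atari Pong, Road-Tree, and a shooter game.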

Abstract

In the context of reinforcement learning we introduce the concept of criticality of a state, which indicates the extent to which the choice of action in that particular state influences the expected return. That