The Role of Diverse Replay for Generalisation in Reinforcement Learning

Jun, 2023

Max Weltevrede, Matthijs T. J. Spaan, Wendelin Böhmer

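TL;DR: Using both theoretical and empirical methods, this study examines from several angles how generalisation can be improved in multi-task reinforcement learning. It finds that increasing the diversity of the transitions in the replay buffer improves generalisation to states that are "reachable" and "unreachable" during training, and also improves the generalisation of the latent representation.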

Abstract

In reinforcement learning (RL), key components of many algorithms are the exploration strategy and replay buffer. These strategies regulat