Policy Search in Continuous Action Domains: an Overview

March 2018

Olivier Sigaud, Freek Stulp

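TL;DR: This paper surveys the state of the art in continuous-action policy search, covering deep reinforcement learning algorithms, competitors based on evolutionary algorithms, Bayesian optimization, and directed exploration methods. It offers a unified perspective and discusses the sample-efficiency properties of the various methods.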

Abstract

Continuous action policy search, the search for efficient policies in continuous control tasks, is currently the focus of intensive research driven both by the recent success of deep reinforcement learning algori