Deep Exploration via Randomized Value Functions

Mar, 2017

Ian Osband, Daniel Russo, Zheng Wen, Benjamin Van Roy

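TL;DR: This work studies the use of randomized value functions to guide deep exploration in reinforcement learning. The approach combines statistically and computationally efficient exploration with common practical approaches to value function learning. The authors demonstrate its effectiveness through computational experiments and prove a regret bound that establishes statistical efficiency with a tabular representation.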

Abstract

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synth