BriefGPT.xyz
Jun, 2016
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
Matteo Turchetta, Felix Berkenkamp, Andreas Krause
TL;DR
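This paper proposes a new algorithm for exploration under safety constraints. It uses a Gaussian process prior to model the unknown safety constraint, actively explores safe states and actions while accounting for reachability, and is able to fully explore the reachable states. The demonstration experiment has a robot explore a digital terrain model.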
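To make the idea concrete, here is a minimal sketch, not the authors' algorithm: it assumes a one-dimensional chain of states, a zero-mean Gaussian process with a squared-exponential kernel over the unknown safety value, and a state counted as safe when the lower confidence bound of that value is above a threshold. The function names, the kernel, and the parameters `length_scale`, `noise_std`, `beta`, and `threshold` are all illustrative assumptions. Reachability is reduced to a graph search through certified states; on an undirected chain this also means the agent can walk back, which stands in for the stricter conditions a full method would need.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of state indices."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise_std=0.05):
    """Posterior mean and standard deviation of a zero-mean GP (assumed prior)."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise_std ** 2 * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    cov = rbf_kernel(x_query, x_query) - k_qo @ np.linalg.solve(k_oo, k_qo.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


def safe_reachable_states(x_obs, y_obs, n_states, start, threshold=0.0, beta=2.0):
    """Return (certified mask, sorted list of certified states reachable from start)."""
    states = np.arange(n_states, dtype=float)
    mean, std = gp_posterior(np.asarray(x_obs, dtype=float),
                             np.asarray(y_obs, dtype=float), states)
    # A state is certified when its pessimistic safety estimate clears the threshold.
    certified = mean - beta * std >= threshold

    reachable = set()
    if certified[start]:
        reachable.add(start)
        frontier = [start]
        while frontier:
            s = frontier.pop()
            for nxt in (s - 1, s + 1):  # chain neighbours (assumed transition model)
                if 0 <= nxt < n_states and certified[nxt] and nxt not in reachable:
                    reachable.add(nxt)
                    frontier.append(nxt)
    return certified, sorted(reachable)


if __name__ == "__main__":
    # Two noisy safety measurements near the start; everything else is unknown.
    certified, reachable = safe_reachable_states([2, 3], [1.0, 1.0], n_states=10, start=2)
    print("certified:", certified.astype(int))
    print("reachable safe states:", reachable)
```

In this sketch, states far from any measurement keep a wide confidence interval and therefore stay outside the safe set until nearby observations shrink the uncertainty, which is the mechanism that lets the safe set grow step by step.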
Abstract
In classical reinforcement learning, when exploring an environment, agents accept arbitrary short term loss for long term gain. This is infeasible for safety critical applications, such as robotics, where even a …