Feb, 2022
BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs
Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato
TL;DR
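This paper proposes a representation-agnostic theoretical framework for Bayesian reinforcement learning (BRL) under partial observability. It also introduces BADDr, a new method based on dropout networks that aims to remove the scalability bottleneck of BRL methods, and shows that it remains effective on larger problems.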
Abstract
While reinforcement learning (RL) has made great advances in scalability, exploration and partial observability are still active research topics. In contrast,