BriefGPT.xyz
Dec, 2020
不确定性感知策略优化:一种稳健、自适应的信任区域方法
Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach
HTML
PDF
James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras
TL;DR
在强化学习中,针对数据量有限的情况,提出了一种基于不确定性管理技术的深度策略优化方法,可以生成稳健的策略更新,适应学习过程中的不确定性水平。
Abstract
In order for
reinforcement learning
techniques to be useful in real-world decision making processes, they must be able to produce
robust performance
from
→