BriefGPT.xyz
Jan, 2019
Reward Shaping via Meta-Learning
Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
TL;DR
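This paper proposes a meta-learning framework over a distribution of tasks that automatically learns efficient reward shaping on newly sampled tasks, addressing the credit assignment problem in reinforcement learning. It demonstrates the effectiveness of the learned shaping across a variety of settings, including a successful transfer from DQN to DDPG.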
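To make the idea of a shaping term concrete, here is a minimal sketch of classic potential-based reward shaping, where the shaped reward is r + γΦ(s') − Φ(s). It is not the paper's method: the paper learns the shaping automatically, while this sketch uses a hand-written potential on a hypothetical 1-D chain of states whose goal is state 4. The names `potential` and `shaped_reward` are illustrative assumptions.

```python
def potential(state, goal=4):
    """Hypothetical hand-written potential: negative distance to the goal state."""
    return -abs(goal - state)


def shaped_reward(reward, state, next_state, gamma=0.99, goal=4):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(next_state, goal) - potential(state, goal)


# The environment reward is sparse: 0 everywhere except on reaching the goal.
toward = shaped_reward(0.0, state=2, next_state=3)  # step toward the goal
away = shaped_reward(0.0, state=2, next_state=1)    # step away from the goal
print(toward > away)  # the shaping term favors progress toward the goal
```

With a sparse environment reward, the shaping term gives the agent a denser signal about which transitions make progress, which is how shaping can ease credit assignment.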
Abstract
Reward shaping is one of the most effective methods to tackle the crucial yet challenging problem of credit assignment in Reinforcement Learning (…