Feb, 2022
GrASP: Gradient-Based Affordance Selection for Planning
Vivek Veeriah, Zeyu Zheng, Richard Lewis, Satinder Singh
TL;DR
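This paper addresses how to handle continuous action spaces when planning with tree search in large-scale reinforcement learning. It learns to select affordances, that is, candidate actions that are useful for planning, and updates the parameters of that selection with a gradient-descent-based method. The paper reports that simultaneously learning to select primitive-action affordances and planning with a learned value-equivalent model outperforms model-free reinforcement learning.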
Abstract
Planning with a learned model is arguably a key component of intelligence. There are several challenges in realizing such a component in large-scale reinforcement learning (RL) problems. One such challenge is dea