BriefGPT.xyz
Jul, 2022
Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning
Wenhao Ding, Haohong Lin, Bo Li, Ding Zhao
TL;DR
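The paper uses a causal graph to strengthen goal-conditioned RL. It proposes an optimization framework with theoretical performance guarantees that loops over causal discovery, transition modeling, and policy training to improve the RL agent's reasoning and generalization, and validates it empirically against five baselines on nine tasks.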
Abstract
As a pivotal component to attaining generalizable solutions in human intelligence, reasoning provides great potential for reinforcement learning (RL) agents' ...