Oct, 2019
RTFM: Generalising to Novel Environment Dynamics via Reading
Victor Zhong, Tim Rocktäschel, Edward Grefenstette
TL;DR
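Language understanding via a reading policy learner is a promising route to generalising to new environments in reinforcement learning. This paper proposes a policy-learning problem grounded in reading text, in which environment dynamics are procedurally generated together with matching language descriptions of those dynamics. With curriculum learning, the model learns strong policies on complex tasks that require several steps of reasoning and coreference.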
Abstract
Obtaining policies that can generalise to new environments in reinforcement learning is challenging. In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle fo