BriefGPT.xyz
Jan, 2023
Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments
Tan Chong Min John, Mehul Motani
TL;DR
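This work addresses the slow convergence of model-based next-state prediction and state-value prediction. It performs model-based planning with a parallel memory retrieval system, uses a neural network to guide the agent's actions, and trains online through goal-directed exploration. The approach achieves a 92% solve rate, suggesting a future for RL in goal and subgoal planning.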
Abstract
Model-based next state prediction and state value prediction are slow to converge. To address these challenges, we do the following: i) Instead of a neural network, we do model-based planning using a parallel mem…