BriefGPT.xyz
Oct, 2022
Random Actions vs Random Policies: Bootstrapping Model-Based Direct Policy Search
Elias Hanna, Alex Coninx, Stéphane Doncieux
TL;DR
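This paper studies how the initial data gathering method affects the learning of a dynamics model and compares two initialization methods used in the literature. The results show that task-dependent factors can be detrimental to each method, and the authors suggest exploring hybrid approaches.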
Abstract
This paper studies the impact of the initial data gathering method on the subsequent learning of a dynamics model. Dynamics models approximate the true transition function of a given task, in order to perform…
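The two initialization schemes named in the title can be illustrated with a minimal sketch. It assumes that "random actions" means drawing an independent uniform action at every step, and that "random policies" means drawing policy parameters once per episode and then acting deterministically; the summary above does not spell out either definition. The toy dynamics `toy_step`, the one-parameter-pair `tanh` policy, and every function name are hypothetical and are not taken from the paper.

```python
import math
import random


def toy_step(state, action):
    """Hypothetical 1-D dynamics: the state drifts by the clipped action."""
    return state + 0.1 * max(-1.0, min(1.0, action))


def collect(policy_factory, n_episodes, horizon, rng):
    """Roll out episodes and return (state, action, next_state) transitions."""
    data = []
    for _ in range(n_episodes):
        policy = policy_factory(rng)  # a fresh policy for every episode
        state = 0.0
        for _ in range(horizon):
            action = policy(state)
            next_state = toy_step(state, action)
            data.append((state, action, next_state))
            state = next_state
    return data


def random_actions(rng):
    """Random actions (assumed reading): a new uniform action at every step."""
    return lambda state: rng.uniform(-1.0, 1.0)


def random_policies(rng):
    """Random policies (assumed reading): sample parameters once, then act
    deterministically as a function of the state."""
    w, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return lambda state: math.tanh(w * state + b)


rng = random.Random(0)
data_ra = collect(random_actions, n_episodes=5, horizon=20, rng=rng)
data_rp = collect(random_policies, n_episodes=5, horizon=20, rng=rng)
```

Either list of transitions could then serve as the initial training set for a dynamics model. In this sketch, the only difference between the two schemes is whether actions are uncorrelated across steps or tied to the state through fixed, randomly drawn parameters.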