Feb, 2022
Open-Ended Reinforcement Learning with Neural Reward Functions
Robert Meier, Asier Mujika
TL;DR
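This work proposes encoding reward functions with neural networks and training them iteratively to encourage increasingly complex behavior. The approach enables unsupervised learning in high-dimensional robotic and pixel-based environments, producing a rich set of skills that includes front flips and one-legged running.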
Abstract
Inspired by the great success of unsupervised learning in Computer Vision and Natural Language Processing, the reinforcement learning community has recently started to focus more on unsupervised discovery of skil