Oct, 2023
Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning
Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng...
TL;DR
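We propose an adaptive action quantization scheme that uses a VQ-VAE to learn a state-conditioned quantization of actions, which avoids the exponential blowup of the action space. Combined with offline reinforcement learning methods, the scheme improves performance on benchmarks. On complex robotic manipulation tasks in the Robomimic environment, offline RL algorithms with discretization achieve a 2-3x improvement over their continuous counterparts.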
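As a rough illustration of the idea in the TL;DR, the sketch below shows the nearest-codebook lookup at the core of vector quantization, next to the size of a naive per-dimension grid. It is not the authors' implementation: the names `nearest_code` and `naive_grid_size`, the codebook size, and the latent dimension are all hypothetical, and the state-conditioned encoder that would produce the latents is assumed rather than shown.

```python
import numpy as np

def nearest_code(latents, codebook):
    """Return, for each latent vector, the index of the closest codebook entry."""
    # latents: (N, D), codebook: (K, D) -> pairwise squared distances: (N, K)
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(dists, axis=1)

def naive_grid_size(bins_per_dim, action_dim):
    """Number of discrete actions when every dimension is binned independently."""
    return bins_per_dim ** action_dim

# Hypothetical sizes: 16 codes in a 4-dimensional latent space.
codebook = np.arange(16 * 4, dtype=float).reshape(16, 4)

# Stand-in for encoder outputs: slightly perturbed copies of codes 3, 7, 3.
# In a state-conditioned scheme these would come from an encoder of (state, action).
latents = codebook[[3, 7, 3]] + 0.1

indices = nearest_code(latents, codebook)
print(indices.tolist())         # discrete action tokens
print(naive_grid_size(10, 7))   # grid size for 10 bins over 7 action dimensions
```

With a learned codebook the number of discrete actions stays at the codebook size (16 here), whereas independent binning grows as bins to the power of the action dimension.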
Abstract
The offline reinforcement learning (RL) paradigm provides a general recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data. While policy constraints ...