Value Guided Exploration with Sub-optimal Controllers for Learning Dexterous Manipulation

March 2023

Value Guided Exploration with Sub-optimal Controllers for Learning Dexterous Manipulation

Gagan Khandate, Cameron Mehlman, Xingsheng Wei, Matei Ciocarlie

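TL;DR: Using sub-optimal controllers to improve sampling and guide exploration is an effective way to speed up the learning of dexterous in-hand manipulation skills. This work is the first to learn finger-gaiting dexterous manipulation skills from highly sub-optimal controllers without relying on exploratory reset distributions.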

Abstract

Recently, reinforcement learning has enabled dexterous manipulation skills of increasing complexity. Nonetheless, learning these skills in simulation still exhibits poor sample efficiency, which stems from the fact that these skills are learned from scratch without the benefit of any domai