BriefGPT.xyz
Oct, 2023
Demonstration-Regularized RL
Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov...
TL;DR
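Expert demonstrations can improve the sample efficiency of reinforcement learning. This study quantifies theoretically how much this additional information reduces sample complexity, and shows the advantages of KL-regularization for reinforcement learning from human feedback.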
Abstract
Incorporating expert demonstrations has empirically helped to improve the sample efficiency of reinforcement learning (RL). This paper quantifies theoretically to what extent this extra information reduces RL's ...