BriefGPT.xyz
Jan, 2023
Anti-Exploration by Random Network Distillation
Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov
TL;DR
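This paper studies random network distillation (RND) as an uncertainty estimator for offline reinforcement learning. It finds that, with specific adjustments, RND can be optimized against effectively, and it proposes a simple and efficient algorithm based on FiLM conditioning that performs well on the D4RL benchmark.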
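To make the idea concrete, here is a minimal NumPy sketch of an RND-style uncertainty score with FiLM conditioning. It is an illustration under stated assumptions, not the authors' implementation: the layer sizes, the choice to modulate action features by the state, and the names `init_film_net`, `film_forward`, and `rnd_penalty` are all hypothetical. The score is the squared error between a trainable predictor and a fixed, randomly initialized target network. Training of the predictor is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_film_net(state_dim, action_dim, hidden=32, out=8):
    """Random weights for a tiny network (sizes are illustrative assumptions)."""
    return {
        "w_a": rng.normal(0.0, 1.0 / np.sqrt(action_dim), (action_dim, hidden)),
        "w_gamma": rng.normal(0.0, 1.0 / np.sqrt(state_dim), (state_dim, hidden)),
        "w_beta": rng.normal(0.0, 1.0 / np.sqrt(state_dim), (state_dim, hidden)),
        "w_out": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, out)),
    }

def film_forward(params, state, action):
    """FiLM: scale and shift the action features with state-dependent gamma and beta."""
    h = np.tanh(action @ params["w_a"])   # action features
    gamma = state @ params["w_gamma"]     # per-feature scale from the state
    beta = state @ params["w_beta"]       # per-feature shift from the state
    h = np.maximum(gamma * h + beta, 0.0) # modulated features, then ReLU
    return h @ params["w_out"]

def rnd_penalty(predictor, target, state, action):
    """Squared prediction error against the fixed random target, one value per sample."""
    diff = film_forward(predictor, state, action) - film_forward(target, state, action)
    return np.sum(diff ** 2, axis=-1)

# Hypothetical batch: 4 transitions, 3-dim states, 2-dim actions.
state_dim, action_dim = 3, 2
target = init_film_net(state_dim, action_dim)     # frozen, never trained
predictor = init_film_net(state_dim, action_dim)  # would be trained on dataset pairs
states = rng.normal(size=(4, state_dim))
actions = rng.normal(size=(4, action_dim))

penalty = rnd_penalty(predictor, target, states, actions)
```

In an anti-exploration setup, a scaled version of such a score would typically be subtracted from the value target, so that actions far from the dataset, where the predictor has not learned to match the target, are penalized. The scaling coefficient and where exactly the penalty enters are design choices not specified in this summary.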
Abstract
Despite the success of random network distillation (RND) in various domains, it was shown as not discriminative enough to be used as an uncertainty estimator for penalizing out-of-distribution actions in
→