BriefGPT.xyz
Jun, 2023
Automatic Trade-off Adaptation in Offline RL
Phillip Swazinna, Steffen Udluft, Thomas Runkler
TL;DR
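This paper proposes AutoLION, an improved offline reinforcement learning algorithm that adapts policy behavior at runtime and uses an autopilot to find the right trade-off parameter for balancing conservatism against performance optimization.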
Abstract
Recently, offline RL algorithms have been proposed that remain adaptive at runtime. For example, the LION algorithm \cite{lion} provides the user with an interface to set the trade-off between behavior cloning an