BriefGPT.xyz
Oct, 2023
Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh
TL;DR
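The distribution shift problem in offline reinforcement learning can be addressed with the distributionally robust learning framework. This paper proposes two offline RL algorithms that use this framework and demonstrates their superior performance in simulation experiments.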
Abstract
The goal of an offline reinforcement learning (RL) algorithm is to learn optimal policies using historical (offline) data, without access to the environment for online exploration. One of the main challenges in offline RL is the