BriefGPT.xyz
Oct, 2023
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang
TL;DR
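We study corruption robustness in offline reinforcement learning, propose a new uncertainty-weight iteration procedure that computes over batched samples, and design an offline RL algorithm that is robust to corruption.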
Abstract
We investigate the problem of corruption robustness in offline reinforcement learning (RL) with general function approximation, where an a