BriefGPT.xyz
Jul, 2024
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems
Yi Zhang, Ruihong Qiu, Jiajun Liu, Sen Wang
TL;DR
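ROLeR is a novel model-based offline reinforcement learning method for reward and uncertainty estimation in recommender systems. It combines a non-parametric reward shaping method with a more representative uncertainty penalty design, and extensive experiments on four benchmark datasets validate its performance.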
Abstract
Offline reinforcement learning (RL) is an effective tool for real-world recommender systems with its capacity to model the dynamic interest of users and its interactive nature. Most existing offline RL