BriefGPT.xyz
Feb, 2024
Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
Peter Vamplew, Cameron Foale, Conor F. Hayes, Patrick Mannion, Enda Howley...
TL;DR
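By introducing the utility-based paradigm, this work extends research from multi-objective reinforcement learning to single-objective reinforcement learning. It discusses the potential benefits for multi-policy learning, risk-aware reinforcement learning, discounting, and safe reinforcement learning, and examines the algorithmic implications of adopting a utility-based approach.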
Abstract
Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from ...