BriefGPT.xyz
Feb, 2024
Multi Task Inverse Reinforcement Learning for Common Sense Reward
Neta Glazer, Aviv Navon, Aviv Shamsian, Ethan Fetaya
TL;DR
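Inaccurate reward functions are a major obstacle to applying reinforcement learning in complex real-world environments. We address this by decomposing the reward into two distinct parts, a task-specific reward and a common-sense reward, and examining how the latter can be learned from expert demonstrations. We show that multi-task inverse reinforcement learning can learn a useful reward function.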
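As a rough illustration of the decomposition described above, the sketch below combines a per-task reward with a shared common-sense term. It is not the authors' implementation: the additive form, the one-dimensional state, the safe range, and all function and task names are hypothetical choices made for the example. The common-sense term is hand-written here, whereas the summary describes learning it from expert demonstrations.

```python
from typing import Callable, Dict

State = float
Reward = Callable[[State], float]


def common_sense_reward(state: State) -> float:
    """Hypothetical shared term: penalize leaving the safe range [-1, 1]."""
    return 0.0 if -1.0 <= state <= 1.0 else -1.0


def make_task_reward(target: State) -> Reward:
    """Hypothetical task-specific term: negative distance to a per-task target."""
    def task_reward(state: State) -> float:
        return -abs(state - target)
    return task_reward


def combined_reward(task_reward: Reward, state: State) -> float:
    """Assumed additive decomposition: task-specific part plus shared part."""
    return task_reward(state) + common_sense_reward(state)


# Two illustrative tasks that differ in their goal but share the common-sense term.
TASKS: Dict[str, Reward] = {
    "reach_left": make_task_reward(-0.5),
    "reach_right": make_task_reward(0.5),
}

if __name__ == "__main__":
    for name, task_reward in TASKS.items():
        for state in (-0.5, 0.5, 2.0):
            print(name, state, combined_reward(task_reward, state))
```

Because the common-sense term is the same function for every task, it is the part that could plausibly be shared and learned across tasks, while the task-specific term changes with each goal.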
Abstract
One of the challenges in applying reinforcement learning in a complex real-world environment lies in providing the agent with a sufficiently detailed reward function. Any misalignment between the reward and the d…