Reinforcement Learning with Lookahead Information

Jun, 2024

Nadav Merlis

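TL;DR: By exploiting lookahead information, we design an algorithm that learns efficiently in unknown environments and substantially improves the efficiency of reward collection.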

Abstract

We study reinforcement learning (RL) problems in which agents observe the reward or transition realizations at their current state before deciding which action to take. Such observations are available in many applications, including transactions, navigation and more. When the environme