Sep, 2023
Tempo Adaptation in Non-stationary Reinforcement Learning
Hyunin Lee, Yuhao Ding, Jongmin Lee, Ming Jin, Javad Lavaei...
TL;DR
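We propose a proactive tempo framework called "ProST" to address the time synchronization problem in non-stationary reinforcement learning. It computes the optimal interaction time, enabling policy optimization under different rates of environmental change. Experimental results show that the ProST framework achieves higher online returns than existing methods in high-dimensional non-stationary environments.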
Abstract
We first raise and tackle the "time synchronization" issue between the agent and the environment in non-stationary reinforcement learning (RL), a crucial factor hindering its real-world applications. In reality, …