BriefGPT.xyz
Jan, 2024
Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes
Baiting Luo, Yunuo Zhang, Abhishek Dubey, Ayan Mukhopadhyay
TL;DR
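For sequential decision-making in non-stationary environments, we propose an adaptive Monte Carlo tree search algorithm that learns the environment's updated dynamics to improve decision-making, reducing overly pessimistic behavior and speeding up decisions.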
Abstract
A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally mod