Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi...
TL;DR: A model-free approach to reinforcement learning of {\omega}-regular objectives.
Abstract
We provide the first solution for model-free reinforcement learning of
{\omega}-regular objectives for Markov decision processes (MDPs). We present a
constructive reduction from the almost-sure satisfaction of {\