Feb, 2025

Is Q-learning an Ill-posed Problem?

Philipp Wissmann, Daniel Hein, Steffen Udluft, Thomas Runkler

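TL;DR: This study addresses the instability of Q-learning in continuous environments, a challenge that frequently troubles practitioners. By systematically examining the effects of bootstrapping and model inaccuracies, it finds that even on relatively simple benchmarks the underlying Q-learning task can be inherently ill-posed and prone to failure, which calls into question the reliability of Q-learning as a general-purpose solution for reinforcement learning.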

Abstract

This paper investigates the instability of Q-learning in continuous environments, a challenge frequently encountered by practitioners. Traditionally, this …