Advances in radio technology and its growing availability are steadily increasing the number of interference signals in the radio spectrum. Interference signals must be detected in a timely fashion in order to maintain standards and keep emergency frequencies open. To this end, specialized (multi-channel) receivers are used for spectrum monitoring. In this paper, the performance of two different approaches for controlling the available receiver resources is compared. The two resource management (ReMa) methods are linear frequency tuning, a heuristic approach, and a Q-learning algorithm from the field of reinforcement learning. To test these methods, a simplified scenario was designed in which two receiver channels monitor ten non-overlapping frequency bands with non-uniform signal activity. For this setting, it is shown that the Q-learning algorithm achieves a significantly higher detection rate than the heuristic approach, at the expense of a smaller exploration rate. In particular, the Q-learning approach can be parameterized to allow for a suitable trade-off between detection and exploration rate.
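
The comparison described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the activity probabilities, the state and action encoding (state is the band a channel is currently tuned to, action is the next band), the reward of 1 per detected signal, the shared Q-table, and the definition of the detection rate as the fraction of active band-steps that were observed are all assumptions made for illustration. The names `ACTIVITY`, `run_linear`, and `run_q_learning` are hypothetical.

```python
import random

N_BANDS, N_CHANNELS, STEPS = 10, 2, 30_000
# Assumed per-step activity probability of each band (non-uniform, illustrative).
ACTIVITY = [0.05, 0.05, 0.05, 0.9, 0.05, 0.05, 0.05, 0.8, 0.05, 0.05]


def sample_activity(rng):
    """Draw which bands carry a signal during one time step."""
    return [rng.random() < p for p in ACTIVITY]


def run_linear(seed=0):
    """Heuristic: the channels sweep the bands in order, N_CHANNELS per step."""
    rng = random.Random(seed)
    detected = total = 0
    for t in range(STEPS):
        active = sample_activity(rng)
        tuned = [(t * N_CHANNELS + c) % N_BANDS for c in range(N_CHANNELS)]
        total += sum(active)
        detected += sum(active[b] for b in tuned)
    return detected / total


def run_q_learning(seed=0, alpha=0.1, gamma=0.5, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy band selection (assumed setup)."""
    rng = random.Random(seed)
    # q[s][a]: estimated value of tuning to band a when currently on band s.
    q = [[0.0] * N_BANDS for _ in range(N_BANDS)]
    state = list(range(N_CHANNELS))  # channel c starts on band c
    detected = total = 0
    for _ in range(STEPS):
        active = sample_activity(rng)
        total += sum(active)
        taken = set()  # bands already claimed by another channel this step
        for c in range(N_CHANNELS):
            free = [b for b in range(N_BANDS) if b not in taken]
            if rng.random() < epsilon:
                action = rng.choice(free)  # explore a random free band
            else:
                action = max(free, key=lambda b: q[state[c]][b])  # exploit
            taken.add(action)
            reward = 1.0 if active[action] else 0.0
            detected += int(active[action])
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[action])
            q[state[c]][action] += alpha * (target - q[state[c]][action])
            state[c] = action
    return detected / total


if __name__ == "__main__":
    print("linear tuning detection rate:", round(run_linear(), 3))
    print("Q-learning detection rate:  ", round(run_q_learning(), 3))
```

In this sketch the linear sweep visits every band equally often, whereas the learner concentrates on the bands it has found to be active. The `epsilon` parameter is one assumed knob for shifting the balance between detection and exploration: larger values spend more steps on randomly chosen bands.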

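This study compares two approaches for controlling the available receiver resources: linear frequency tuning as a heuristic approach and a Q-learning algorithm from the field of reinforcement learning. Tests in a simplified scenario show that the Q-learning algorithm achieves a higher detection rate than the heuristic approach and can be parameterized to trade off detection rate against exploration rate.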

Real-time spectrum monitoring via reinforcement learning: a comparison of Q-learning and heuristic methods