Despite the potential of active inference for vision-based control, learning the model and the preferences (priors) while interacting with the environment is challenging. Here, we study the performance of a deep active inference (dAIF) agent on OpenAI's car racing benchmark, where there is no access to the car's state. The agent learns to encode the world's state from high-dimensional input through unsupervised representation learning. State inference and control are learned end-to-end by optimizing the expected free energy. Results show that our model achieves performance comparable to deep Q-learning. However, vanilla dAIF does not reach state-of-the-art performance compared to other world model approaches. Hence, we discuss the limitations of the current model implementation and potential architectures to overcome them.
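
To make the control step concrete, the following is a minimal sketch of how a discrete-action agent might turn expected free energy (EFE) estimates into an action distribution, assuming the common choice of a softmax over negative EFE. It is an illustration only, not the implementation evaluated here: the function name `action_probabilities`, the `precision` parameter, the action labels, and the EFE values are all hypothetical.

```python
import math

def action_probabilities(efe_values, precision=1.0):
    """Softmax over negative EFE: actions with lower EFE get higher probability.

    `precision` is an assumed inverse-temperature parameter (hypothetical name).
    """
    logits = [-precision * g for g in efe_values]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical discretized action set and made-up EFE estimates, one per action.
actions = ["left", "right", "accelerate", "brake", "no-op"]
efe = [2.1, 1.9, 0.4, 3.0, 1.2]

probs = action_probabilities(efe)
best_action = actions[probs.index(max(probs))]
print(best_action, [round(p, 3) for p in probs])
```

In a full agent the EFE values would come from a learned network evaluated on the encoded state, and an action could be sampled from `probs` rather than chosen greedily.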

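This study examines the performance of a deep active inference (dAIF) agent on OpenAI's car racing benchmark without access to the car's state. The agent learns state inference and control through unsupervised representation learning. Results show that the model achieves performance comparable to deep Q-learning, but vanilla dAIF does not reach state-of-the-art performance compared with other world model approaches. The paper discusses the limitations of the current model implementation and possible architectures to overcome them.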

Deep Active Inference for Pixel-Based Discrete Control: Evaluation on the Car Racing Problem