Observational Overfitting in Reinforcement Learning

Dec, 2019

Xingyou Song, Yiding Jiang, Yilun Du, Behnam Neyshabur

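TL;DR: This work provides a framework for analyzing overfitting that can arise in model-free reinforcement learning. We design multiple synthetic benchmarks by modifying only the observation space, and show experimentally how generalization relates to implicit regularization.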

Abstract

A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the M