When a person is not satisfied with how a robot performs a task, they can
intervene to correct it. Reward learning methods enable the robot to adapt its
reward function online based on such human input, but they rely on handcrafted
features. When the correction cannot be explained by t