Oct, 2023

Learning a Reward Function for User-Preferred Appliance Scheduling

Nikolina Čović, Jochen Cremer, Hrvoje Pandžić

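TL;DR: Using an inverse reinforcement learning model, this paper proposes a method that builds users' daily household appliance schedules from their past consumption data, without requiring them to state their needs and preferences explicitly, and thereby encourages them to keep participating in demand response service provision.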

Abstract

Accelerated development of demand response service provision by the residential sector is crucial for reducing carbon emissions in the pow