BriefGPT.xyz
Nov, 2023
Signal Temporal Logic-Guided Apprenticeship Learning
Aniruddh G. Puranic, Jyotirmoy V. Deshmukh, Stefanos Nikolaidis
TL;DR
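By encoding temporal logic specifications that describe high-level task objectives as graphs in order to define temporal-based metrics, the framework improves the quality of the inferred rewards and policies. Experiments show that it overcomes the drawbacks of prior literature by greatly improving the number of demonstrations required to learn a control policy.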
Abstract
Apprenticeship learning crucially depends on effectively learning rewards, and hence control policies, from user demonstrations. Of particu