BriefGPT.xyz
Dec, 2021
The Statistical Complexity of Interactive Decision Making
Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin
TL;DR
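Introduces the Decision-Estimation Coefficient, a measure of the complexity of interactive learning that characterizes the optimal regret achievable by sample-efficient algorithms. Also introduces a new algorithm design principle, Estimation-to-Decisions (E2D), which adapts supervised learning algorithms to online decision making and thereby achieves sample-efficient learning. In reinforcement learning, the Decision-Estimation Coefficient readily recovers most existing hardness results and lower bounds.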
Abstract
A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample…