We devise an online learning algorithm -- titled Switching via Monotone
Adapted Regret Traces (SMART) -- that adapts to the data and achieves regret
that is instance optimal, i.e., simultaneously competitive on every input
sequence with both the performance of the follow-the-leader (FTL) policy and
the worst-case guarantee of any other input policy. We show that the regret of
the SMART policy on any input sequence is within a multiplicative factor
$e/(e-1) \approx 1.58$ of the smaller of: 1) the regret obtained by FTL on the
sequence, and 2) the upper bound on regret guaranteed by the given worst-case
policy. This implies a strictly stronger guarantee than typical
`best-of-both-worlds' bounds as the guarantee holds for every input sequence
regardless of how it is generated. SMART is simple to implement as it begins by
playing FTL and switches at most once during the time horizon to the worst-case
algorithm. Our approach and results follow from an operational reduction of
instance optimal online learning to competitive analysis for the ski-rental
problem. We complement our competitive ratio upper bounds with a fundamental
lower bound showing that, over all input sequences, no algorithm can guarantee
regret within a factor smaller than $1.43$ of the minimum of the regret
achieved by FTL and that of the minimax-optimal policy. We also present a
modification of SMART that combines
FTL with a ``small-loss'' algorithm to achieve instance optimality between the
regret of FTL and the small-loss regret bound.
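
The switch-once structure described above can be illustrated with a minimal sketch. It assumes the experts setting with losses in $[0,1]$, uses Hedge with a known horizon as the worst-case policy, and draws the switching threshold from the classical randomized ski-rental distribution (density $e^x/(e-1)$ on $[0,1]$), which is the source of the $e/(e-1)$ ratio in that problem. The names `sample_threshold`, `switch_once`, and `worst_case_bound` are hypothetical, and the switching rule shown here (compare FTL's running regret to a scaled threshold) is a simplification for illustration, not necessarily the exact rule used by SMART.

```python
import math
import random


def sample_threshold(rng):
    """Draw x in [0, 1) with density e^x / (e - 1) via inverse-CDF sampling.

    This is the threshold distribution of classical randomized ski rental.
    """
    return math.log(1.0 + rng.random() * (math.e - 1.0))


def switch_once(losses, worst_case_bound, rng):
    """Play FTL, then switch at most once to Hedge (illustrative sketch).

    losses: list of per-round loss vectors, each entry in [0, 1].
    worst_case_bound: regret bound of the fallback policy (an assumption
        supplied by the caller, e.g. sqrt(T * ln(n) / 2) for Hedge).
    Returns (total_loss, switch_round), where switch_round is None if the
    policy never leaves FTL. After the switch, total_loss accumulates the
    expected loss of Hedge's randomized choice.
    """
    n = len(losses[0])
    horizon = len(losses)
    cum = [0.0] * n          # cumulative loss of each expert
    ftl_loss = 0.0           # loss incurred while following the leader
    total = 0.0
    threshold = sample_threshold(rng) * worst_case_bound
    switched_at = None
    weights = [1.0] * n      # Hedge weights, used only after the switch
    eta = math.sqrt(8.0 * math.log(n) / horizon)  # standard known-horizon rate

    for t, loss in enumerate(losses):
        if switched_at is None:
            # FTL: play the expert with the smallest cumulative loss so far.
            leader = min(range(n), key=lambda i: cum[i])
            incurred = loss[leader]
            ftl_loss += incurred
        else:
            # Hedge: expected loss under the current weights, then update.
            z = sum(weights)
            incurred = sum(w / z * l for w, l in zip(weights, loss))
            weights = [w * math.exp(-eta * l) for w, l in zip(weights, loss)]
        total += incurred
        for i in range(n):
            cum[i] += loss[i]
        # Switch (once) when FTL's running regret exceeds the threshold.
        if switched_at is None and ftl_loss - min(cum) > threshold:
            switched_at = t + 1
    return total, switched_at
```

On a sequence with a stable leader the sketch never leaves FTL, whereas on an alternating sequence that is adversarial for FTL it switches within the first few rounds, so its loss is governed by the fallback policy from then on.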

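We propose an online learning algorithm, Switching via Monotone Adapted Regret Traces (SMART), that adapts to the data and achieves regret that is simultaneously competitive, on every input sequence, with the performance of the follow-the-leader (FTL) policy and the worst-case guarantee of any other input policy. We show that the regret of the SMART policy on any input sequence is within a multiplicative factor $e/(e-1) \approx 1.58$ of both the regret obtained by FTL and the regret guaranteed by the given worst-case policy. SMART is also simple to implement, and its analysis rests on a reduction of instance-optimal online learning to competitive analysis for the ski-rental problem. We also present a modification of SMART that combines FTL with a ``small-loss'' algorithm to achieve instance optimality between the regret of FTL and the small-loss regret bound.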