We study the batched best arm identification (BBAI) problem, where the
learner's goal is to identify the best arm while switching the policy as rarely
as possible. In particular, we aim to find the best arm with probability
$1-\delta$ for some small constant $\delta>0$ while minimizing b