We consider a sequential subset selection problem under parameter
uncertainty, where at each time step, the decision maker selects a subset of
cardinality $K$ from $N$ possible items (arms) and observes (bandit)
feedback in the form of the index of one of the items in said subset, o