We consider a setting where a system learns to rank a fixed set of m items. The goal is to produce a good ranking for users with diverse interests who interact with the system for T rounds in an online fashion. We consider a novel top-1 feedback model for this problem: at the end of each