In online sequential decision-making, we address delayed feedback within the framework of online convex optimization (OCO), where the feedback on a decision can arrive with an unknown delay. Unlike previous research, which is limited to the Euclidean norm and gradient information, we propose three families of delayed algorithms based on approximate solutions to handle different types of received feedback. The proposed algorithms are versatile and apply to general norms. Specifically, we introduce a family of Follow the Delayed Regularized Leader algorithms for feedback with full information on the loss function, a family of Delayed Mirror Descent algorithms for feedback with gradient information on the loss function, and a family of Simplified Delayed Mirror Descent algorithms for feedback consisting of the values of the loss function's gradients at the corresponding decision points. For each family, we provide regret bounds under general convexity and under relative strong convexity. We also demonstrate the efficiency of each algorithm under different norms through concrete examples. Furthermore, our theoretical results are consistent with the current best bounds when reduced to the standard settings.
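To make the delayed-feedback protocol concrete, the following is a minimal sketch and not one of the algorithms proposed here. It assumes the Euclidean norm, under which mirror descent reduces to projected online gradient descent, and it assumes feedback in the form of the gradient value at the decision point, revealed a given number of rounds later. The names `delayed_ogd`, `grad_fns`, `delays`, `eta`, and `radius` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def delayed_ogd(grad_fns, delays, x0, eta, radius=1.0):
    """Projected online gradient descent with delayed gradient feedback.

    Illustrative assumptions (not taken from the paper):
      grad_fns[t](x) returns the gradient of the round-t loss at x,
      delays[t] is the number of rounds before that gradient is revealed,
      the feasible set is the Euclidean ball of the given radius.
    Returns the list of decisions played in each round.
    """
    x = list(x0)
    arrivals = defaultdict(list)  # round index -> gradients revealed at the end of that round
    decisions = []
    for t, (grad_fn, d) in enumerate(zip(grad_fns, delays)):
        decisions.append(list(x))  # play the current decision x_t
        # The gradient at x_t exists now but only becomes usable at the end of round t + d.
        arrivals[t + d].append(grad_fn(x))
        # Apply every gradient whose delay expires at the end of this round.
        for g in arrivals.pop(t, []):
            x = [xi - eta * gi for xi, gi in zip(x, g)]
        # Project back onto the Euclidean ball.
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > radius:
            x = [xi * radius / norm for xi in x]
    return decisions
```

With a constant delay of two rounds, the first three decisions in this sketch equal the starting point, because no feedback has arrived yet; setting every delay to zero recovers standard projected online gradient descent.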

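In the field of online sequential decision-making, we use the online convex optimization (OCO) framework to address problems with delays, where the feedback on a decision may arrive with an unknown delay. We propose three families of delayed algorithms based on approximate solutions to handle different types of received feedback. The proposed algorithms are versatile and apply to general norms, and we give corresponding regret bounds for each type of algorithm. Through concrete examples, we demonstrate the efficiency of each algorithm under different norms, and our theoretical results are consistent with the current best bounds in the standard settings.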

Online Sequential Decision-Making with Unknown Delays