We consider the problem of designing a sequential decision-making agent that maximizes an unknown function which switches over time. At each step, the agent receives an observation of the function's value at a point chosen by the agent; the observation may be corrupted by noise. The agent is also constrained to make safe decisions with high probability, i.e., the function value at each chosen point should be greater than a threshold. For this switching environment, we propose a policy called Adaptive-SafeOpt and evaluate its performance via simulations. The policy combines Bayesian optimization with change point detection to solve the safe sequential optimization problem. We observe that a major challenge in adapting to a switch is to identify safe decisions once a change point is detected and to avoid being drawn to local optima.
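
To make the setting concrete, the following is a minimal sketch of a safe Bayesian optimization loop with a residual-based change detector. It is not the Adaptive-SafeOpt policy itself, whose details the abstract does not give. Everything in it is an assumption made for illustration: the 1-D domain [0, 1], the squared-exponential kernel with zero prior mean, the confidence multipliers `beta`, the rule of choosing the safe point with the highest upper confidence bound, and the reset rule that discards all earlier data when a change is flagged.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=0.05):
    """GP posterior mean and std at x_query (zero prior mean, unit prior variance)."""
    K = rbf(x_obs, x_obs) + noise**2 * np.eye(len(x_obs))
    Ks = rbf(x_query, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def safe_ucb_step(x_obs, y_obs, x_grid, threshold, beta=2.0):
    """Among grid points whose lower confidence bound clears the threshold,
    return the index with the highest upper confidence bound (None if no point is safe)."""
    mean, std = gp_posterior(x_obs, y_obs, x_grid)
    safe = mean - beta * std >= threshold
    if not safe.any():
        return None
    return int(np.argmax(np.where(safe, mean + beta * std, -np.inf)))

def change_detected(x_obs, y_obs, x_new, y_new, beta=3.0, noise=0.05):
    """Flag a change when the new observation falls outside the GP's predictive band."""
    mean, std = gp_posterior(x_obs, y_obs, np.array([x_new]))
    return bool(abs(y_new - mean[0]) > beta * np.sqrt(std[0] ** 2 + noise**2))

def run(f_before, f_after, switch_at, steps, x_seed, threshold, noise=0.05, seed=0):
    """Simulate the loop; returns the queried points and the steps where a change was flagged."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(0.0, 1.0, 101)

    def observe(t, x):
        f = f_after if t >= switch_at else f_before
        return f(x) + noise * rng.normal()

    xs, ys = [x_seed], [observe(0, x_seed)]  # x_seed is assumed safe at the start
    queried, flagged = [], []
    for t in range(1, steps):
        idx = safe_ucb_step(np.array(xs), np.array(ys), x_grid, threshold)
        x_new = xs[-1] if idx is None else float(x_grid[idx])
        y_new = observe(t, x_new)
        if change_detected(np.array(xs), np.array(ys), x_new, y_new):
            flagged.append(t)
            xs, ys = [], []  # assumed reset rule: drop data from the old function
        xs.append(x_new)
        ys.append(y_new)
        queried.append(x_new)
    return queried, flagged
```

For example, `run(lambda x: 1.0, lambda x: 3.0, switch_at=10, steps=20, x_seed=0.5, threshold=0.5)` simulates a single switch at step 10. The reset step is where the difficulty noted in the abstract appears: after the data are discarded, the only safety evidence is the most recent observation, and nothing guarantees that this point is still safe under the new function.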

Safe Sequential Optimization in Switching Environments