Consider the sequential optimization of an unknown, expensive-to-evaluate, and possibly non-convex objective function $f$ from noisy observations, which can be viewed as a continuum-armed bandit problem. Bayesian optimization algorithms based on Gaussian process (GP) models have been shown to perform favorably in this setting. In particular, upper bounds have been proven on the regret of two popular algorithms, GP-UCB and GP-TS, under both Bayesian (when $f$ is a sample from a GP) and frequentist (when $f$ lives in a reproducing kernel Hilbert space) settings. These regret bounds depend crucially on the maximal information gain $\gamma_T$ between $T \in \mathbb{N}$ observations and the underlying GP (surrogate) model. In this paper, we build on the spectral properties of positive definite kernels to prove novel bounds on $\gamma_T$. In contrast to existing works, which rely on specific kernels (such as Mat\'ern and SE) to provide explicit bounds on $\gamma_T$ and regret, we provide general results in terms of the decay rate of the eigenvalues of the kernel. Specializing our results to common kernels leads to significant improvements over the existing bounds on $\gamma_T$ and regret. For the Mat\'ern and SE kernels, for which lower bounds on regret are known, our results reduce the gap between the upper and lower bounds from a factor polynomial in $T$, in existing work, to a logarithmic one under the Bayesian setting. Furthermore, since our bounds on $\gamma_T$ are independent of the optimization algorithm, they impact the regret bounds in various other settings where $\gamma_T$ is essential.
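
As a rough illustration of the quantity behind $\gamma_T$, the sketch below evaluates the information gain $\frac{1}{2}\log\det(I + \sigma^{-2} K)$ of a fixed set of points under an SE kernel with Gaussian observation noise. This is the standard closed form for GP models and is not taken from the text above. The function names, the lengthscale, and the noise variance $\sigma^2$ are assumptions chosen for illustration. Since $\gamma_T$ is the maximum over all sets of $T$ points, the value for any particular set is only a lower bound on it.

```python
import math


def se_kernel(x, y, lengthscale=0.2):
    """Squared-exponential (SE) kernel on scalars; lengthscale is an assumed value."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def information_gain(points, noise_var=0.1, lengthscale=0.2):
    """Return 0.5 * log det(I + K / noise_var) for the given observation points."""
    n = len(points)
    # M = I + K / noise_var is symmetric positive definite.
    m = [
        [
            se_kernel(points[i], points[j], lengthscale) / noise_var
            + (1.0 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Cholesky factorization M = L L^T, so log det M = 2 * sum(log L_ii).
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    # 0.5 * log det M equals the sum of the logs of the Cholesky diagonal.
    return sum(math.log(chol[i][i]) for i in range(n))


# Information gain of evenly spaced points on [0, 1] for growing T.
for t in (5, 10, 20, 40):
    grid = [i / (t - 1) for i in range(t)]
    print(t, information_gain(grid))
```

Running the loop for increasing $T$ gives a feel for how slowly the gain of a fixed design grows under a smooth kernel, which is the behavior that the eigenvalue decay rate controls.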

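This work studies the regret performance of Gaussian process based learning algorithms (GP-UCB, GP-TS) for the continuum-armed bandit problem. It analyzes the polynomial gap under both the Bayesian and the frequentist settings, and specializing the results to common kernels further improves the regret bounds.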

Information Gain and Regret Bounds in Gaussian Process Bandits