A key challenge for modern Bayesian statistics is how to perform scalable inference of posterior distributions. To address this challenge, variational Bayes (VB) methods have emerged as a popular alternative to classical Markov chain Monte Carlo (MCMC) methods. VB methods tend to be faster while achieving comparable predictive performance. However, few theoretical results exist for VB. In this