In this paper, we propose a novel approach to Bayesian Experimental Design
(BED) for non-exchangeable data that formulates it as risk-sensitive policy
optimization. We develop the Inside-Out SMC^2 algorithm, which uses a nested
sequential Monte Carlo (SMC) estimator of the expected information gain (EIG)
and embeds it into a particle Markov chain Monte Carlo (pMCMC) framework to
perform gradient-based policy optimization. This is in contrast to recent
approaches that rely on biased estimators of the EIG to amortize the cost of
experiments by learning a design policy in advance.
Numerical validation on a set of dynamical systems demonstrates the efficacy of
our method relative to other state-of-the-art strategies.
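
For intuition about the quantity being estimated, the following is a minimal sketch of a plain nested Monte Carlo EIG estimator on a toy static model. It is not the Inside-Out SMC^2 algorithm and does not cover the sequential, non-exchangeable setting. The model (theta ~ N(0, 1), y | theta ~ N(design * theta, noise_std^2)) and all function names are assumptions chosen for illustration because the EIG has a closed form there. With a finite number of inner samples, this kind of estimator is biased, overestimating the EIG in expectation.

```python
import numpy as np


def log_normal_pdf(y, mean, std):
    """Log density of N(mean, std^2) evaluated at y (broadcasts)."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((y - mean) / std) ** 2


def nested_mc_eig(design, n_outer=1000, n_inner=1000, noise_std=1.0, seed=0):
    """Nested Monte Carlo estimate of the EIG for the assumed toy model."""
    rng = np.random.default_rng(seed)

    # Outer samples: draw a parameter from the prior, then simulate an outcome.
    theta = rng.standard_normal(n_outer)
    y = design * theta + noise_std * rng.standard_normal(n_outer)
    log_lik = log_normal_pdf(y, design * theta, noise_std)

    # Inner samples: fresh prior draws per outer sample to estimate log p(y).
    theta_inner = rng.standard_normal((n_outer, n_inner))
    inner = log_normal_pdf(y[:, None], design * theta_inner, noise_std)

    # Numerically stable log-mean-exp over the inner dimension.
    max_inner = inner.max(axis=1, keepdims=True)
    log_marginal = max_inner[:, 0] + np.log(np.exp(inner - max_inner).mean(axis=1))

    return float(np.mean(log_lik - log_marginal))


def exact_eig(design, noise_std=1.0):
    """Closed-form EIG for the assumed linear-Gaussian toy model."""
    return 0.5 * np.log1p(design**2 / noise_std**2)


if __name__ == "__main__":
    print("nested MC:", nested_mc_eig(1.0))
    print("exact:    ", exact_eig(1.0))
```

Comparing the two printed values for several choices of n_inner gives a rough sense of how the bias shrinks as the inner sample size grows.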
