Sequence generation models have long relied on the beam search
algorithm to generate output sequences. However, the correctness of beam search
degrades when a model is over-confident about a suboptimal prediction. In
this paper, we propose to perform minimum Bayes-risk