In this paper, we propose a method for reprogramming pre-trained audio-driven
talking face synthesis models so that they can operate on text inputs. Because
an audio-driven talking face synthesis model takes speech audio as input,
generating a talking avatar with the desired speech content requires recording
the speech in advance. However, this is