In this paper, we propose a novel method for speaker adaptation in lip
reading, motivated by two observations. Firstly, a speaker's own
characteristics can always be portrayed well by a few of his/her facial images,
or even a single image, using shallow networks, while the fine-grained dynamic