Traditional language models treat language as a finite-state automaton on a
probability space over words. This is a very strong assumption when modeling
something as inherently complex as language. In this paper, we challenge this
by showing how the linear chain assumption inherent i