Capturing the meaning of sentences has long been a challenging task. Current
models tend to apply linear combinations of word features to conduct semantic
composition for bigger-granularity units e.g. phrases, sentences, and
documents. However, the semantic linearity does not always hold in human
language. For instance, the meaning of the phrase `ivory tower