TL;DR: Using a gate-interaction-based Decompositional Interdependence (DI) measure for LSTMs, this work probes the hierarchical structure of an LSTM's sequential compositional representations and finds that DI correlates with syntactic distance. To explore the inductive biases that shape these compositional representations during training, it runs simple synthetic-data experiments that support a hypothesis of how hierarchical structure is learned bottom-up.
Abstract
Recent work in NLP shows that LSTM language models capture hierarchical
structure in language data. In contrast to existing work, we consider the
\textit{learning} process that leads to their compositional behavior. For a
closer look at how an LSTM's sequential representations are comp