Aug, 2018
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih
TL;DR
Through a detailed empirical study, this paper examines the trade-offs in how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) affects both end-task accuracy in NLP and the quality of the learned linguistic representations. The results show that pre-trained bidirectional language models learn more about linguistic structure than previously thought, and that high-quality contextual representations are learned regardless of architecture.
Abstract
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g. LSTM, CNN, or self attention) influences both end task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphological at the word embedding layer through local syntax to longer range semantics.
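To make "contextual representations that vary with network depth" concrete, here is a minimal, self-contained PyTorch sketch (not the authors' code; the class name, dimensions, and toy vocabulary are all hypothetical) of a two-layer bidirectional LSTM encoder that exposes one representation per layer, from context-independent word embeddings at layer 0 to increasingly contextual layers above. It omits the biLM training objective and directional softmaxes; it only illustrates the layered representations the paper analyzes.

# Hypothetical sketch, not the paper's implementation.
import torch
import torch.nn as nn

class ToyBiLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=64, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One bidirectional LSTM per layer so each layer's output can be kept.
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            in_dim = embed_dim if i == 0 else 2 * hidden_dim
            self.layers.append(
                nn.LSTM(in_dim, hidden_dim, batch_first=True, bidirectional=True)
            )

    def forward(self, token_ids):
        # Layer 0: context-independent word embeddings (morphological/lexical).
        reps = [self.embed(token_ids)]
        out = reps[0]
        # Higher layers: increasingly contextual representations
        # (local syntax lower, longer-range semantics higher, per the paper).
        for lstm in self.layers:
            out, _ = lstm(out)
            reps.append(out)
        return reps  # one (batch, seq_len, dim) tensor per layer

vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}  # toy vocabulary
model = ToyBiLM(vocab_size=len(vocab))
ids = torch.tensor([[vocab["the"], vocab["cat"], vocab["sat"]]])
for depth, rep in enumerate(model(ids)):
    print(f"layer {depth}: shape {tuple(rep.shape)}")

A downstream task would typically consume a learned weighted combination of these per-layer representations rather than only the top layer, which is what lets different tasks draw on different depths.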