BriefGPT.xyz
Nov, 2016
Recurrent Memory Addressing for describing videos
Kumar Krishna Agrawal, Arnav Kumar Jain, Abhinav Agarwalla, Pabitra Mitra
TL;DR
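This paper proposes a way to apply Key-Value Memory Networks in a multimodal setting, together with a new key-addressing mechanism. It decomposes video captioning naturally into vision and language segments, which are treated as key-value pairs, and it introduces a recurrent attention approach in the addressing schema to capture contextual information. Experiments show that the method improves BLEU@4 and METEOR scores and achieves performance competitive with state-of-the-art methods.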
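To make the key-value idea concrete, here is a minimal sketch of a single generic key-value memory read: a query is scored against the keys, the scores are normalized with a softmax, and the output is the weighted sum of the values. This is an illustration under stated assumptions, not the paper's model: the function names, the dot-product scoring, and the toy vectors are all hypothetical, and the recurrent key-addressing mechanism the paper proposes is not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def read_memory(query, keys, values):
    """Address the memory by its keys and return (weights, weighted sum of values).

    Assumption: dot-product scoring; the summary does not state the scoring function.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    read = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, read


# Hypothetical toy memory: keys stand in for visual features,
# values stand in for language embeddings.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [2.0, 0.0]

weights, read = read_memory(query, keys, values)
```

In this toy setup the first and third keys match the query equally well, so they receive equal weight and the second slot receives the least. A recurrent addressing scheme would additionally condition each read on earlier reads, which this sketch leaves out.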
Abstract
Deep Neural Network architectures with external memory components allow the model to perform inference and capture long-term dependencies by storing information explicitly. In this paper, we generalize key-value memory networks to a …