BriefGPT.xyz
May, 2023
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation
Liyan Kang, Luyang Huang, Ningxin Peng, Peihao Zhu, Zewei Sun...
TL;DR
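This study presents BigVideo, a large-scale video subtitle translation dataset intended to advance research on multimodal machine translation. It also introduces a contrastive learning method in the cross-modal encoder. The results show that visual information significantly improves NMT model performance and helps resolve ambiguity.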
Abstract
We present a large-scale video subtitle translation dataset, BigVideo, to facilitate the study of multi-modality machine translation. Compared with the widely used How2 and VaTeX datasets, BigVideo is more than 1...