BriefGPT.xyz
Jul, 2021
Towards an Interpretable Latent Space in Structured Models for Video Prediction
Rushil Gupta, Vishal Sharma, Yash Jain, Yitao Liang, Guy Van den Broeck...
TL;DR
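We propose an object-centric model that uses contrastive learning with graph neural networks to predict future states in the latent space, and we inject explicit inductive biases to improve the model's prediction accuracy. The model not only captures object interactions but also localizes objects more accurately, and experiments show significant advantages across multiple domains.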
Abstract
We focus on the task of future frame prediction in video governed by underlying physical dynamics. We work with models which are object-centric, i.e., explicitly work with object representations, and propagate a loss in the