BriefGPT.xyz
Jul, 2022
Grounding Visual Representations with Texts for Domain Generalization
Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim
TL;DR
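This paper proposes a cross-modal domain generalization method based on natural language supervision. It uses representations built from the interaction of visual and textual inputs to fuse high-level, class-discriminative information, and uses an explainable model to generate explanations, thereby improving the model's generalization ability and performance. The authors' method achieves state-of-the-art results on multiple datasets.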
Abstract
Reducing the representational discrepancy between source and target domains is a key component to maximize the model generalization. In this work, we advocate for leveraging natural language supervision for the domain g