Oct, 2023
Investigating the Limitation of CLIP Models: The Worst-Performing Categories
Jie-Jing Shao, Jiang-Xin Shi, Xiao-Wen Yang, Lan-Zhe Guo, Yu-Feng Li
TL;DR
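This work examines two forms of alignment in CLIP models and proposes a class-wise matching margin to address their weak performance on certain categories. The approach improves accuracy on the 10 worst-performing ImageNet categories without manual optimization or access to labeled validation data.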
Abstract
Contrastive Language-Image Pre-training (CLIP) provides a foundation model by integrating natural language into visual concepts, enabling zero-sh