BriefGPT.xyz
Mar, 2024
Eye-gaze Guided Multi-modal Alignment Framework for Radiology
Chong Ma, Hanqi Jiang, Wenting Chen, Zihao Wu, Xiaowei Yu...
TL;DR
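Eye-gaze data is used to assist the alignment of image and text features, reducing reliance on manual annotation and lowering training cost. The work also examines how different amounts of eye-gaze data affect model performance, highlighting the feasibility and practicality of integrating this auxiliary data into multi-modal pre-training.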
Abstract
In multi-modal frameworks, the alignment of cross-modal features presents a significant challenge. The predominant approach in multi-modal