BriefGPT.xyz
Apr, 2025
Yo'Chameleon: Personalized Vision and Language Generation
Thao Nguyen, Krishna Kumar Singh, Jing Shi, Trung Bui, Yong Jae Lee...
TL;DR
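This work addresses the lack of personalized knowledge in large multimodal models, particularly for image generation. The proposed Yo'Chameleon uses soft-prompt tuning to incorporate user-provided information about a specific concept, so the model can both answer multimodal questions about that concept and generate images of it. The study reports that the method improves image quality in few-shot settings while maintaining good performance across modalities.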
Abstract
Large Multimodal Models (e.g., GPT-4, Gemini, Chameleon) have evolved into powerful tools with millions of users. However, they remain generic models and lack personalized knowledge of specific user concepts. Previous work has explored