BriefGPT.xyz
May, 2024
LLMs Meet Multimodal Generation and Editing: A Survey
Yingqing He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu...
TL;DR
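A survey of multimodal generation techniques that covers key advances across image, video, 3D, and audio. It reviews methods and datasets, presents tool-augmented multimodal agents that use existing generative models for human-computer interaction, and discusses AI safety, emerging applications, and future prospects.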
Abstract
With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on un...