Apr, 2025
Kimi-Audio Technical Report
KimiTeam, Ding Ding, Zeqian Ju, Yichong Leng, Songxiang Liu...
TL;DR
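This work addresses shortcomings in audio understanding, generation, and conversation by introducing Kimi-Audio, an open-source audio foundation model. It uses a novel LLM-based architecture and is trained on more than 13 million hours of multimodal audio data. It achieves leading performance on multiple audio benchmarks and has broad application potential.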
Abstract
We present Kimi-Audio, an open-source audio foundation model that excels in audio understanding, generation, and conversation. We detail the practices in building Kimi-Audio, including model architecture, data cu