BriefGPT.xyz
June 2023
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping
Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel
TL;DR
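This paper introduces Computron, a system that uses memory swapping to serve multiple distributed models on a shared GPU cluster. Its model parallel swapping design improves resource utilization.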
Abstract
Many of the most performant deep learning models today in fields like language and image understanding are fine-tuned models that contain billions of parameters. In anticipation of workloads that involve serving ...