BriefGPT.xyz
Dec, 2023
Fairness in Serving Large Language Models
Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li...
TL;DR
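This paper introduces a definition of fairness in LLM serving based on a cost function and proposes a novel scheduling algorithm, the Virtual Token Counter (VTC), built on the continuous batching mechanism. Extensive experiments show that VTC ensures fairness considerably better than the baseline methods, each of which falls short under various conditions.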
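To make the idea of a counter-based fair scheduler more concrete, here is a minimal sketch. It is not the paper's algorithm as published: the summary above only says that fairness is defined through a cost function and that VTC builds on continuous batching. Everything else is an assumption made for illustration, including the class and method names (`VirtualTokenCounterSketch`, `submit`, `next_request`), the weights `w_in` and `w_out`, charging the full cost up front, and the rule that lifts a returning client's counter.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    client: str
    input_tokens: int
    output_tokens: int  # assumed known here; a real server learns it while decoding


@dataclass
class VirtualTokenCounterSketch:
    """Hypothetical per-client token accounting; not the authors' implementation."""
    w_in: float = 1.0   # assumed cost weight per input token
    w_out: float = 2.0  # assumed cost weight per output token
    counters: dict = field(default_factory=dict)  # client -> accumulated cost
    queues: dict = field(default_factory=dict)    # client -> deque of waiting requests

    def submit(self, req: Request) -> None:
        queue = self.queues.setdefault(req.client, deque())
        if not queue:
            # Assumed "lift" rule: a client returning from idle starts no lower
            # than the least-served backlogged client, so idle time earns no credit.
            active = [self.counters[c] for c, q in self.queues.items()
                      if q and c != req.client]
            floor = min(active) if active else 0.0
            self.counters[req.client] = max(self.counters.get(req.client, 0.0), floor)
        queue.append(req)

    def next_request(self):
        """Return the next request from the backlogged client served least so far."""
        backlogged = [c for c, q in self.queues.items() if q]
        if not backlogged:
            return None
        client = min(backlogged, key=lambda c: self.counters[c])
        req = self.queues[client].popleft()
        # Simplification: charge the whole weighted cost when the request is admitted.
        self.counters[client] += (self.w_in * req.input_tokens
                                  + self.w_out * req.output_tokens)
        return req


# Usage: client "A" floods the queue, client "B" sends a single request.
scheduler = VirtualTokenCounterSketch()
for _ in range(4):
    scheduler.submit(Request("A", input_tokens=10, output_tokens=10))
scheduler.submit(Request("B", input_tokens=10, output_tokens=10))
order = [scheduler.next_request().client for _ in range(5)]
```

In this sketch the request from "B" is served right after the first request from "A", instead of waiting behind all four, because the scheduler always picks the backlogged client with the smallest accumulated cost.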
Abstract
High-demand LLM inference services (e.g., ChatGPT and BARD) support a wide range of requests, from short chat conversations to long document reading. To ensure that all client requests are processed fairly, most major
ll