BriefGPT.xyz
Jan, 2019
The OoO VLIW JIT Compiler for GPU Inference
Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez...
TL;DR
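This paper proposes a VLIW-inspired JIT compiler that coalesces and reorders execution kernels at runtime to improve GPU utilization while still meeting latency SLOs. By comparing the inefficiencies of spatial and temporal multiplexing, it shows that spatial coalescing can achieve a sizable 7.7x opportunity gap.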
Abstract
Current trends in machine learning (ML) inference on hardware accelerated devices (e.g., GPUs, TPUs) point to alarmingly low utilization. As ML i…