BriefGPT.xyz
Jun, 2023
Tune As You Scale: Hyperparameter Optimization For Compute Efficient Training
Abraham J. Fetterman, Ellie Kitanidis, Joshua Albrecht, Zachary Polizzi, Bryden Fogelman...
TL;DR
This paper proposes a Bayesian optimization algorithm called CARBS, which performs local search around the performance-cost Pareto frontier to tackle the difficulty of tuning hyperparameters for large-scale deep learning models. It automates much of the "black magic" of tuning, applies to any deep learning problem, and discovers scaling laws for various hyperparameters, making tuning more compute-efficient.
Abstract
Hyperparameter tuning of deep learning models can lead to order-of-magnitude performance gains for the same amount of compute. Despite this, systematic tuning is uncommon, particularly for large models, which are …
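The TL;DR describes CARBS as Bayesian optimization via local search around the performance-cost Pareto frontier. As a minimal sketch of that idea (not the actual CARBS implementation — the function names, the dominance rule over `(cost, score)` pairs, and the Gaussian perturbation step are illustrative assumptions), one can keep the set of non-dominated runs and propose new hyperparameter candidates by perturbing the configurations of those Pareto-optimal runs:

```python
import random

def pareto_front(observations):
    """Return the (cost, score) points not dominated by any other:
    a point is dominated if some other point has cost <= its cost AND
    score >= its score, with at least one inequality strict."""
    front = []
    for i, (ci, si) in enumerate(observations):
        dominated = any(
            cj <= ci and sj >= si and (cj < ci or sj > si)
            for j, (cj, sj) in enumerate(observations)
            if j != i
        )
        if not dominated:
            front.append((ci, si))
    return front

def propose_candidates(front_params, sigma=0.1, n=4, rng=None):
    """Local search: sample new configs by Gaussian perturbation of
    hyperparameters (assumed already in log space) from runs on the
    Pareto front."""
    rng = rng or random.Random(0)
    return [
        {k: v + rng.gauss(0.0, sigma) for k, v in p.items()}
        for p in (rng.choice(front_params) for _ in range(n))
    ]

# Example: four runs as (compute cost, validation score) pairs.
runs = [(1.0, 0.50), (2.0, 0.70), (2.0, 0.60), (3.0, 0.65)]
print(pareto_front(runs))  # → [(1.0, 0.5), (2.0, 0.7)]
```

In the full algorithm, a probabilistic surrogate model scores such candidates before any are actually trained; this sketch only shows the "search near the Pareto frontier" structure that distinguishes the approach from global-search Bayesian optimization.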