Matthew J. Filipovich, Alessandro Cappelli, Daniel Hesslow, Julien Launay
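TL;DR: Using scaling laws, this work studies the efficiency of Direct Feedback Alignment (DFA) for training causal decoder-only Transformers. DFA does not outperform backpropagation in either compute or data requirements, and more empirical approaches are needed to better understand modeling decisions.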
Abstract
Alternatives to backpropagation have long been studied to better understand
how biological brains may learn. Recently, they have also garnered interest as
a way to train neural networks more efficiently. By relax