Sep, 2024
ASTRA: Accurate and Scalable ANNS-based Training of Extreme Classifiers
Sonu Mehta, Jayashree Mohan, Nagarajan Natarajan, Ramachandran Ramjee, Manik Varma
TL;DR
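This work addresses the training-efficiency and accuracy problems of extreme classification and proposes a new algorithm, ASTRA. By combining importance-sampled negatives with uniformly sampled negatives, the algorithm achieves higher precision and, on a dataset with 120M labels, reduces training time by 4x to 15x, advancing the state of the art in extreme classification.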
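The idea of mixing two kinds of negatives can be illustrated with a minimal sketch. It is not ASTRA's actual procedure, which this summary does not describe. It assumes that the importance-sampled negatives are the non-positive labels scoring highest for a query under an inner product, found here by exact search as a stand-in for an ANNS index. The function name `sample_negatives`, its parameters, and the toy data are all hypothetical.

```python
import random


def sample_negatives(query_vec, label_vecs, positives, n_hard, n_uniform, rng):
    """Return label ids to use as negatives for one query (illustrative only).

    Hard negatives: the highest-scoring non-positive labels under an inner
    product score (exact search here, standing in for an ANNS index).
    Uniform negatives: drawn uniformly from the remaining non-positive labels.
    """
    def score(label_id):
        return sum(q * w for q, w in zip(query_vec, label_vecs[label_id]))

    candidates = [l for l in range(len(label_vecs)) if l not in positives]
    # Highest score first: the non-positive labels that score closest to the query.
    hard = sorted(candidates, key=score, reverse=True)[:n_hard]
    hard_set = set(hard)
    # Uniform negatives come from everything that is neither positive nor hard.
    rest = [l for l in candidates if l not in hard_set]
    uniform = rng.sample(rest, min(n_uniform, len(rest)))
    return hard + uniform


# Toy data (assumed): 1000 labels and one query, all 8-dimensional.
rng = random.Random(0)
label_vecs = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
query_vec = [rng.gauss(0, 1) for _ in range(8)]
negatives = sample_negatives(
    query_vec, label_vecs, positives={3, 17}, n_hard=5, n_uniform=5, rng=rng
)
```

At the scale mentioned above (120M labels), the exact `sorted` pass over all labels would be impractical, which is where an approximate nearest neighbor index would replace it.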
Abstract
"Extreme Classification" (or XC) is the task of annotating data points (queries) with relevant labels (documents), from an extremely large set of $L$ possible labels, arising in search and recommendations. The most successful