BriefGPT.xyz
Keywords
reinforced learning from human feedback
Search results - 1
KwaiYiiMath Technical Report
KwaiYiiMath enhances mathematical reasoning abilities by applying Supervised Fine-Tuning and Reinforced Learning from Hu
9 months ago