June 2024

Reflection-Reinforced Self-Training for Language Agents

TL;DR: Reflection-Reinforced Self-Training (Re-ReST) uses a reflection model to refine low-quality samples and augment the self-training data, efficiently improving sample quality.
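The idea in the summary can be illustrated with a minimal sketch of one data-collection round. This is not the authors' implementation: the function names (`collect_self_training_data`, `generate`, `is_good`, `reflect`) and the toy task are hypothetical, and the sketch assumes some quality check is available to separate good samples from low-quality ones.

```python
from typing import Callable, List, Tuple

Task = int
Sample = int


def collect_self_training_data(
    tasks: List[Task],
    generate: Callable[[Task], Sample],         # hypothetical agent model
    is_good: Callable[[Task, Sample], bool],    # hypothetical quality check
    reflect: Callable[[Task, Sample], Sample],  # hypothetical reflection model
) -> List[Tuple[Task, Sample]]:
    """Collect one round of self-training data with reflection-based refinement."""
    dataset: List[Tuple[Task, Sample]] = []
    for task in tasks:
        sample = generate(task)
        if is_good(task, sample):
            # High-quality samples are used for self-training directly.
            dataset.append((task, sample))
            continue
        # Low-quality samples get one refinement attempt from the reflector.
        refined = reflect(task, sample)
        if is_good(task, refined):
            dataset.append((task, refined))
        # Samples that are still low-quality after refinement are dropped.
    return dataset


# Toy stand-ins (assumptions for illustration only): the "correct" output
# for a task t is 2 * t.
def toy_generate(task: Task) -> Sample:
    # The toy agent only succeeds on even tasks.
    return task * 2 if task % 2 == 0 else -1


def toy_is_good(task: Task, sample: Sample) -> bool:
    return sample == task * 2


def toy_reflect(task: Task, sample: Sample) -> Sample:
    # The toy reflector can repair small tasks but not large ones.
    return task * 2 if task < 5 else -1


data = collect_self_training_data(
    [1, 2, 3, 4, 7], toy_generate, toy_is_good, toy_reflect
)
print(data)
```

In this toy run, tasks 1 and 3 fail at generation but are recovered by the reflector, so the training set is larger than it would be with plain filtering; task 7 is dropped because the refined sample still fails the check.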