May, 2024

利用大型语言模型启发增强 Q-Learning

TL;DRLLM-guided Q-learning combines the advantages of large language models and Q-learning without introducing performance bias, providing action-level guidance and converting hallucinations into exploration costs, resulting in improved sampling efficiency and suitability for complex control tasks.