Recent advances in machine learning have shown that Reinforcement Learning
from Human Feedback (RLHF) can improve machine learning models and align them
with human preferences. Although highly successful for Large Language Models
(LLMs), these advances have not had a comparable impact on research in
autonomous vehicles, where alignment with human expectations is essential.
In this work, we combine Reinforcement Learning from Human Feedback (RLHF) with Large Language Models (LLMs) in a novel way to improve autonomous driving safety. We employ multiple human-controlled agents, such as cars and pedestrians, to simulate realistic road environments; we integrate physical and physiological feedback with LLMs to optimize the fine-tuning of the autonomous driving model; and we validate our model with data collected on real-world testbeds in New Jersey and New York City.
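At the core of an RLHF pipeline like the one described above is a reward model trained on human preference comparisons. As a minimal sketch, assuming the standard Bradley-Terry formulation (the scores and preference pairs below are illustrative placeholders, not data or methods from this paper):

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry negative log-likelihood for preference pairs.

    r_chosen / r_rejected: reward-model scores for the trajectory the
    human rater preferred vs. the one they rejected (in this setting,
    preferences could be derived from physical and physiological
    feedback signals).
    """
    r_chosen = np.asarray(r_chosen, dtype=float)
    r_rejected = np.asarray(r_rejected, dtype=float)
    # P(chosen preferred over rejected) = sigmoid(r_chosen - r_rejected),
    # so the negative log-likelihood is log(1 + exp(-(r_c - r_r))).
    logits = r_chosen - r_rejected
    return float(np.mean(np.log1p(np.exp(-logits))))

# Illustrative scores: the reward model already ranks the preferred
# (safer) driving behaviour higher, so the loss is small.
loss = preference_loss([2.0, 1.5], [0.5, -0.3])
```

Minimizing this loss over human comparisons yields a reward model whose signal can then drive policy fine-tuning (e.g. with PPO); the details here are a generic sketch, not the authors' specific method.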