BriefGPT.xyz
May, 2018
Verifiable Reinforcement Learning via Policy Extraction
Osbert Bastani, Yewen Pu, Armando Solar-Lezama
TL;DR
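The VIPER algorithm trains decision tree policies to make reinforcement learning safer and easier to verify. Compared with other algorithms, it performs reliably on both the Atari Pong and cart-pole tasks.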
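To make the idea of extracting a tree policy concrete, here is a minimal sketch of DAgger-style imitation in plain Python. It is not the VIPER algorithm itself, whose details are not given on this page. Everything in it is an assumption made for illustration: the hand-written `oracle_policy` stands in for a trained deep policy, the two-dimensional toy dynamics in `step` stand in for a real environment, and a depth-1 decision tree (a stump) stands in for a full decision tree learner.

```python
import random

def oracle_policy(state):
    """Hypothetical stand-in for a trained DNN policy (1 = push right, 0 = push left)."""
    pos, vel = state
    return 1 if pos + 0.5 * vel < 0 else 0

def step(state, action):
    """Toy dynamics (an assumption): the action applies a unit force to a point mass."""
    pos, vel = state
    force = 1.0 if action == 1 else -1.0
    vel = vel + 0.1 * force
    pos = pos + 0.1 * vel
    return (pos, vel)

def rollout(policy, rng, horizon=25):
    """Run `policy` from a random start and return the states it visits."""
    state = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    states = []
    for _ in range(horizon):
        states.append(state)
        state = step(state, policy(state))
    return states

def fit_stump(data):
    """Fit a depth-1 decision tree by exhaustive search over feature, threshold, and leaf labels."""
    best = None
    for f in range(2):
        for t in sorted({s[f] for s, _ in data}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if s[f] <= t else 1 - left_label) != a
                    for s, a in data
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    _, f, t, left_label = best
    return lambda s: left_label if s[f] <= t else 1 - left_label

def extract_policy(iterations=4, rollouts=4, seed=0):
    """DAgger-style loop: roll out the current policy, relabel with the oracle, refit the tree."""
    rng = random.Random(seed)
    data = []
    policy = oracle_policy  # the first iteration follows the oracle
    for _ in range(iterations):
        for _ in range(rollouts):
            for s in rollout(policy, rng):
                data.append((s, oracle_policy(s)))  # oracle labels the visited states
        policy = fit_stump(data)
    return policy, data

if __name__ == "__main__":
    policy, data = extract_policy()
    agreement = sum(policy(s) == a for s, a in data) / len(data)
    print(f"stump agrees with the oracle on {agreement:.0%} of {len(data)} states")
```

The extracted policy is a single threshold test on one state feature, so it can be inspected directly. That kind of transparency is what makes tree policies attractive for verification.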
Abstract
While deep reinforcement learning has successfully solved many challenging control tasks, its real-world applicability has been limited by the inability to ensure the safety of learned policies. We propose an app