July 2018
Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao
TL;DR: Memory Augmented Policy Optimization (MAPO) improves the sample efficiency and robustness of policy gradient methods on tasks with sparse rewards. Applied to program synthesis from natural language, it achieves state-of-the-art accuracy with only weak supervision.