Jul, 2020
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving
Yuhuai Wu, Albert Jiang, Jimmy Ba, Roger Grosse
TL;DR
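Introduces a learning-assisted theorem proving benchmark based on proving inequality theorems, designed to test agents' generalization ability. The benchmark provides a lightweight, user-friendly theorem proving environment, and the authors evaluate learning-based baselines and Monte Carlo tree search on it.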
Abstract
In learning-assisted theorem proving, one of the most critical challenges is to generalize to theorems unlike those seen at training time. In this paper, we introduce INT, an inequality theorem proving benchmark,