An AI system can create and maintain knowledge only to the extent that it can verify that knowledge itself. Recent work on Long Chain-of-Thought reasoning has demonstrated the great potential of LLMs for solving competitive problems, but their verification ability remains weak and not