Nicolas Tempelmeier, Elena Demidova, Stefan Dietze
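TL;DR: The study finds that Web markup is missing a large amount of information. This paper proposes a supervised learning approach that makes node information more descriptive by obtaining categorical property data, ultimately achieving accuracies of 79% and 83%.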
Abstract
Embedded markup of Web pages has seen widespread adoption in recent
years, driven by standards such as RDFa and Microdata and initiatives such as
schema.org; recent studies show adoption by 39% of all Web pages
as early as 2016. While this constitutes an important information source for
tasks such as Web search, Web page classification or kn