Automatically Refining the Wikipedia Infobox Ontology

Abstract:

The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.

Keywords: Semantic Web; Ontology; Wikipedia; Markov Logic Networks

Authors: Fei Wu, Daniel S. Weld

Author affiliation: Computer Science & Engineering Department, University of Washington, Seattle, WA, USA (both authors)

Conference type: International conference

Conference name: The 17th International World Wide Web Conference (WWW08)

Conference location: Beijing

Conference language: English

Online publication date: 2008-04-21 (date first posted on the Wanfang platform; does not represent the paper's publication date)
