Conference Topic

Dynamically Constructing a Global Schema for Web Entities

With the rapid development of the Internet, popular entities have more and more instances on the Web. On one hand, different instances of the same Web entity often contain different attributes, and different instances often use different labels for the same attribute; on the other hand, new Web entity instances containing new attributes and labels keep appearing on the Web. It is therefore difficult to dynamically construct a global schema for the Web entities of a given entity type, although such a schema is highly desirable for Web entity instance detection, extraction, and integration. In this paper, we propose a novel approach to dynamically constructing a global schema for the Web entities of a given entity type. First, an SVM (support vector machine) classification model is built from Web entity instances that have been extracted from related Web pages. Then, based on this model, a global schema discovery approach is used to dynamically construct the global schema for the target entity type. Experimental results on Chinese Web sites show that the approach is general and effective.

Web Information Integration; Web Entities; SVM; Global Schema

Xiuxing Xu, Qingzhong Li, Yongquan Dong, Yanhui Ding

School of Computer Science and Technology, Shandong University, Jinan, Shandong Province, 250101, P.R. China

International conference

2010 Seventh Web Information System and Applications Conference (7th National Conference on Web Information Systems and Applications)

Hohhot

English

127-131

2010-08-20 (date first made available online on the Wanfang platform; not the paper's publication date)