Automatic Structured Web Databases Classification
The growing number of structured Web databases poses enormous challenges for large-scale Deep Web data integration. Organizing these structured Web databases into a hierarchical directory tree is a critical step towards large-scale integration of the Deep Web. This paper presents a method for the automatic classification of Web databases. First, a method is proposed for calculating the semantic similarity between Web databases based on their interface schemas; the calculation is reduced to an extended optimal matching problem on a bipartite graph. Then, based on the resulting similarity matrix, an agglomerative hierarchical clustering algorithm is proposed that automatically organizes the Web databases into a hierarchy tree. Theoretical analysis and experimental results show that the method is efficient.
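The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define the "extended" optimal matching or the linkage criterion, so this sketch assumes a Jaccard word-overlap score between attribute labels, a brute-force optimal one-to-one matching (only practical for small schemas), and average linkage. All function names and the sample schemas are hypothetical.

```python
from itertools import permutations


def label_sim(a, b):
    """Jaccard overlap between the word sets of two attribute labels (assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def schema_sim(s1, s2):
    """Weight of the best one-to-one attribute matching, normalised by the larger schema."""
    small, large = sorted((s1, s2), key=len)
    if not small:
        return 0.0
    # Brute-force search over all injective assignments of `small` into `large`.
    best = max(
        sum(label_sim(a, b) for a, b in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)


def cluster(schemas):
    """Average-linkage agglomerative clustering; returns a nested-tuple hierarchy tree."""
    names = list(schemas)
    # Pairwise similarity matrix between Web database interface schemas.
    sim = {(a, b): schema_sim(schemas[a], schemas[b]) for a in names for b in names}
    clusters = [(name, [name]) for name in names]  # (subtree, member names)
    while len(clusters) > 1:
        best_pair, best_score = None, -1.0
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                m1, m2 = clusters[i][1], clusters[j][1]
                score = sum(sim[a, b] for a in m1 for b in m2) / (len(m1) * len(m2))
                if score > best_score:
                    best_pair, best_score = (i, j), score
        i, j = best_pair
        (t1, m1), (t2, m2) = clusters[i], clusters[j]
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [((t1, t2), m1 + m2)]  # merge the most similar pair
    return clusters[0][0]


# Hypothetical interface schemas: attribute labels of three query forms.
schemas = {
    "books_a": ["title", "author", "isbn"],
    "books_b": ["book title", "author name", "publisher"],
    "flights": ["departure city", "arrival city", "departure date"],
}
tree = cluster(schemas)
```

In this toy input the two book-search schemas share label words and are merged first, and the flight schema joins only at the root. For realistic schema sizes the brute-force matching would need to be replaced by a polynomial-time assignment algorithm.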
Web databases; interface schema; bipartite graph matching; hierarchical clustering
XiaoJun Cui, ZhongSheng Ren, HongYu Xiao, Le Xu
Wenzhou Vocational College of Science and Technology Wenzhou, China State Key Laboratory of Software College of Mathematics, and Computer Science Fujian Normal University Fuzhou, China Department of Information and Technology Wenzhou Vocational College of Science and Technology Wenzho
International conference
Xiamen
English
305-309
2010-10-29 (date first posted online on the Wanfang platform; not the paper's publication date)