Clustering Deep Web Databases Semantically
Deep Web database clustering is a key operation in organizing Deep Web resources. Traditional approaches compute similarity as the cosine similarity in the Vector Space Model (VSM), which cannot capture the semantic similarity between the contents of two databases. This paper discusses how to cluster Deep Web databases semantically. First, a fuzzy semantic measure is proposed, which integrates ontology and fuzzy set theory to compute the semantic similarity between the visible features of two Deep Web forms. Then a hybrid Particle Swarm Optimization (PSO) algorithm is provided for Deep Web database clustering. Finally, the clustering results are evaluated according to the Average Similarity of Document to the Cluster Centroid (ASDC) and the Rand Index (RI). Experiments show that: 1) the hybrid PSO approach has higher ASDC values than the PSO and K-Means approaches, which means it achieves higher intra-cluster similarity and lower inter-cluster similarity; 2) the clustering results based on fuzzy semantic similarity have higher ASDC and RI values than those based on cosine similarity, which supports the conclusion that the fuzzy semantic similarity approach can explore latent semantics.
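As a rough illustration of two quantities named in the abstract, the sketch below computes the VSM cosine similarity and the Rand Index in plain Python. It is not the authors' implementation: the function names, the three-term vocabulary, and the toy vectors and labels are all hypothetical, chosen only to show why two forms that use different words for the same concept score low under cosine similarity.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two term-weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for an empty vector
    return dot / (norm_u * norm_v)


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two labelings agree.

    A pair agrees when both labelings put the two items in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # convention for fewer than two items
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical vocabulary: ["author", "writer", "title"].
# Form A labels its field "author", form B labels it "writer".
form_a = [1, 0, 1]
form_b = [0, 1, 1]
similarity = cosine_similarity(form_a, form_b)

# Hypothetical reference classes versus a clustering result for four databases.
ri = rand_index([0, 0, 1, 1], [0, 0, 1, 0])
```

In this toy case only the shared term "title" contributes to the cosine score, although "author" and "writer" denote the same concept; this is the kind of gap the abstract attributes to VSM cosine similarity.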
Semantic Deep Web clustering; Fuzzy set; Ontology; PSO; K-Means
Ling Song, Jun Ma, Po Yan, Li Lian, Dongmei Zhang
School of Computer Science & Technology, Shandong University, 250061, China; School of Computer Science & Technology, Shandong University, 250061, China
International conference
4th Asia Information Retrieval Symposium (AIRS 2008)
Harbin
English
365-376
2008-01-16 (date first posted online on the Wanfang platform; not the paper's publication date)