Conference Topic

Extracting and Clustering Method of Web Bipartite Cores

This paper addresses several key problems in Web community discovery. In the context of topic-oriented community discovery, we analyze the shortcomings of the CBG (complete bipartite graph) used in the trawling method. We introduce the concept of the x-cores-set, which is a more reasonable signature of a community core than the CBG. We construct a bipartite graph from a node x and then apply (i, j) pruning to the graph to obtain the x-cores-set. By scanning the topic subgraph, we extract a collection of x-cores-sets. Finally, a hierarchical clustering algorithm is applied to these x-cores-sets to form the dendrogram of communities. We prove that the x-cores-set, which consists of x-cores, can be computed from a bipartite graph collected from x followed by (i, j) pruning. The experiments use the same dataset as the HITS method, except that the returned pages are aggregated from four search engines. The results show that our algorithm is effective and efficient.
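The (i, j) pruning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard iterative degree pruning used in trawling: a "fan" (linking page) is dropped if it points to fewer than j "centers" (linked pages), and a center is dropped if fewer than i fans point to it, repeating until nothing changes. The function name `ij_prune`, the edge-list input, and the fan/center roles are illustrative assumptions; the abstract does not specify how the bipartite graph around x is collected or how x-cores are read off the pruned graph, so this is not the authors' implementation.

```python
from collections import defaultdict


def ij_prune(edges, i, j):
    """Hypothetical helper: iterative (i, j) pruning of a bipartite graph.

    edges: iterable of (fan, center) pairs.
    Returns a dict mapping each surviving fan to its surviving centers.
    """
    out_links = defaultdict(set)  # fan -> centers it points to
    in_links = defaultdict(set)   # center -> fans pointing to it
    for fan, center in edges:
        out_links[fan].add(center)
        in_links[center].add(fan)

    changed = True
    while changed:
        changed = False
        # Drop fans that point to fewer than j centers.
        for fan in [f for f, cs in out_links.items() if len(cs) < j]:
            for c in out_links.pop(fan):
                in_links[c].discard(fan)
            changed = True
        # Drop centers that are pointed to by fewer than i fans.
        for center in [c for c, fs in in_links.items() if len(fs) < i]:
            for f in in_links.pop(center):
                out_links[f].discard(center)
            changed = True
    return {f: set(cs) for f, cs in out_links.items()}


# Toy graph: fans a, b, c all point to x and y; d and z are weakly connected.
edges = [("a", "x"), ("a", "y"), ("a", "z"),
         ("b", "x"), ("b", "y"),
         ("c", "x"), ("c", "y"),
         ("d", "x")]
core = ij_prune(edges, i=3, j=2)
# Fan d and center z are pruned; a, b, c keep exactly {x, y}.
```

Removing one low-degree node can push its neighbors below the threshold, which is why the loop repeats until a full pass makes no change.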

web communities; bipartite cores; hierarchical clustering

Nan Yang, Hui Ding, Yue Liu

The Information School, Renmin University of China, Beijing, China; The Information School, Renmin University, Beijing, China

International conference

2010 Seventh Web Information System and Applications Conference (7th National Conference on Web Information Systems and Applications)

Hohhot

English

29-34

2010-08-20 (date first posted online on the Wanfang platform; not the paper's publication date)