LABEL ORIENTED CLUSTERING FOR SOCIAL NETWORK DISCUSSION GROUPS
This paper proposes applying the Bisecting K-means algorithm to cluster social network discussion groups and to provide a meaningful label for each cluster containing these groups. The clustering of the discussion groups is based on the heterogeneous meta-features that define each group, e.g. title, description, type, subtype, and network. The main idea is to represent each group as a tuple of multiple feature vectors, construct a proper similarity measure for each feature space, and then perform the clustering using the proposed Bisecting K-means clustering algorithm. The main key phrases are extracted from the titles and descriptions of the discussion groups of a given cluster and combined with the main meta-features to build a phrase label for the cluster. Analysis of the experimental results showed that combining more than one feature produced better clustering in terms of quality and interrelationship between the discussion groups of a given cluster. Some features, such as the network, improved the compactness and tightness of the objects within the clusters, while other features, such as the type and subtype, improved the separation of the clusters.
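The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every concrete choice in it is an assumption: the per-feature similarities (Jaccard overlap of word sets for text, exact match for categorical fields), the equal feature weights, the rule of always splitting the largest cluster, and the use of medoids instead of means (a mean is not defined for mixed text and categorical features, so this is strictly a bisecting 2-medoids variant). The labeling step uses the most frequent title word as a stand-in for key phrase extraction.

```python
from collections import Counter
from itertools import combinations

# Field names follow the meta-features listed in the abstract.
TEXT_FIELDS = ("title", "description")
CATEGORY_FIELDS = ("type", "subtype", "network")


def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(g1, g2, weights=None):
    """Weighted average of per-feature similarities (equal weights are an assumption)."""
    weights = weights or {f: 1.0 for f in TEXT_FIELDS + CATEGORY_FIELDS}
    total = 0.0
    for f in TEXT_FIELDS:
        total += weights[f] * jaccard(set(g1[f].lower().split()),
                                      set(g2[f].lower().split()))
    for f in CATEGORY_FIELDS:
        total += weights[f] * (1.0 if g1[f] == g2[f] else 0.0)
    return total / sum(weights.values())


def bisect(cluster, groups, iterations=10):
    """Split one cluster (a list of indices) into two halves around two medoids."""
    # Seed with the least similar pair so the split is deterministic.
    medoids = list(min(combinations(cluster, 2),
                       key=lambda p: similarity(groups[p[0]], groups[p[1]])))
    for _ in range(iterations):
        # Each medoid stays in its own half, so neither half is ever empty.
        halves = ([medoids[0]], [medoids[1]])
        for i in cluster:
            if i in medoids:
                continue
            s0 = similarity(groups[i], groups[medoids[0]])
            s1 = similarity(groups[i], groups[medoids[1]])
            halves[0 if s0 >= s1 else 1].append(i)
        # New medoid: the member most similar to the rest of its half.
        new_medoids = [max(h, key=lambda c: sum(similarity(groups[c], groups[o])
                                                for o in h))
                       for h in halves]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return halves


def bisecting_clustering(groups, k):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    clusters = [list(range(len(groups)))]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) >= 2]
        if not splittable:
            break
        target = max(splittable, key=len)
        clusters.remove(target)
        clusters.extend(bisect(target, groups))
    return clusters


def label(cluster, groups):
    """Toy label: most frequent title word plus the majority type."""
    words = Counter(w for i in cluster for w in groups[i]["title"].lower().split())
    types = Counter(groups[i]["type"] for i in cluster)
    return f"{words.most_common(1)[0][0]} ({types.most_common(1)[0][0]})"
```

Each group is passed as a dictionary with the five keys above; `bisecting_clustering(groups, k)` returns lists of group indices, and `label(cluster, groups)` builds a short label for one of them. Changing the entries of `weights` is one way to explore how individual features affect compactness and separation.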
Clustering; Bisecting K-means algorithm; Social network; Discussion groups
Ahmed Rafea, Ahmed El Kholy, Sherif G. Aly
Computer Science and Engineering Department, School of Science and Engineering, American University in Cairo, New Cairo, Egypt
International conference
13th International Conference on Enterprise Information Systems (ICEIS 2011)
Beijing
English
2091-2096
2011-06-08 (date first posted online on the Wanfang platform; not the paper's publication date)