One Optimized Choosing Method of K-Means Document Clustering Center
A method for choosing cluster centers based on sub-graph division is presented. After the similarity matrix is constructed, a disconnected graph is built with each text node as a vertex, and this graph is then analyzed. Within an allowable error range, the method automatically determines both the number of cluster centers and the centers themselves. Noise data are effectively eliminated in the process of finding the cluster centers. Experimental results on two document sets show that the method is effective: compared with traditional methods, the F-measure increases by 8%.
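The abstract gives only an outline of the procedure, so the following is a minimal sketch of one plausible reading rather than the authors' implementation. It assumes bag-of-words vectors, cosine similarity, a fixed similarity threshold for drawing edges, connected components as the sub-graphs, and a minimum component size below which documents are treated as noise. The names `initial_centers`, `threshold`, and `min_size` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def initial_centers(docs, threshold=0.3, min_size=2):
    """Derive K-means starting centers from a thresholded similarity graph.

    threshold and min_size are assumed parameters, not values from the paper.
    Returns (centers, kept_components, noise_indices).
    """
    vecs = [Counter(d.lower().split()) for d in docs]
    n = len(vecs)

    # Similarity matrix turned into adjacency lists: an edge joins two
    # documents whose similarity reaches the threshold.
    adj = [
        [j for j in range(n) if j != i and cosine(vecs[i], vecs[j]) >= threshold]
        for i in range(n)
    ]

    # Each connected component of the graph is one sub-graph.
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))

    # Small components are treated as noise; the rest fix the cluster count.
    kept = [c for c in components if len(c) >= min_size]
    noise = sorted(i for c in components if len(c) < min_size for i in c)

    # One center per kept component: the mean of its term-count vectors.
    centers = []
    for comp in kept:
        total = Counter()
        for i in comp:
            total.update(vecs[i])
        centers.append({t: v / len(comp) for t, v in total.items()})
    return centers, kept, noise
```

Under these assumptions, the number of kept components plays the role of K, and the returned centers would be passed to a standard K-means run as its starting points in place of random initialization.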
Document Clustering; K-means; Initial Center; Sub-graph Division
Hongguang Suo, Kunming Nie, Xin Sun, Yuwei Wang
School of Computer and Communication Engineering, China University of Petroleum, Dongying, China
International conference
4th Asia Information Retrieval Symposium (AIRS 2008)
Harbin
English
490-495
2008-01-16 (date first posted online on the Wanfang platform; not the publication date of the paper)