Research of Text Clustering Based on Hybrid Parallel Genetic Algorithm
The K-means clustering algorithm is sensitive to the choice of initial cluster centers and easily falls into a local optimum. To avoid this flaw, we propose a Hybrid Parallel Genetic Algorithm. In this method, the document set is represented in the Vector Space Model, and initial cluster centers are chosen at random from the document vectors to form chromosomes. By combining the efficiency of the K-means algorithm with the global optimization ability of the Parallel Genetic Algorithm, the method improves the efficiency and precision of text clustering through inheritance and mutation within each community, together with parallel evolution and interbreeding among communities. Experiments indicate that the Hybrid Parallel Genetic Algorithm achieves higher accuracy and better global optimization than other text clustering methods such as the K-means algorithm and the Genetic Algorithm.
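The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes raw term-frequency vectors with L2 normalization as the Vector Space Model, squared Euclidean distance, within-cluster sum of squared errors as the (inverse) fitness, one K-means refinement step per child, and ring migration of each community's best chromosome. Communities are evolved one after another here rather than in parallel. All function names and parameter values (`to_vectors`, `hybrid_pga`, `islands`, `migrate_every`, and so on) are hypothetical.

```python
import random

def to_vectors(texts):
    """Vector Space Model sketch: L2-normalized term-frequency vectors."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vecs = []
    for t in texts:
        words = t.lower().split()
        v = [float(words.count(w)) for w in vocab]
        norm = sum(x * x for x in v) ** 0.5 or 1.0
        vecs.append([x / norm for x in v])
    return vecs

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(docs, centers):
    """Index of the nearest center for every document."""
    return [min(range(len(centers)), key=lambda j: sq_dist(d, centers[j]))
            for d in docs]

def sse(docs, centers):
    """Within-cluster sum of squared errors; lower means fitter."""
    return sum(min(sq_dist(d, c) for c in centers) for d in docs)

def kmeans_step(docs, centers):
    """One K-means update; an empty cluster keeps its old center."""
    labels = assign(docs, centers)
    new_centers = []
    for j, old in enumerate(centers):
        members = [d for d, l in zip(docs, labels) if l == j]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))
    return new_centers

def evolve_island(docs, pop, rng, mutation_rate):
    """Selection, one-point crossover, mutation, then K-means refinement."""
    pop = sorted(pop, key=lambda c: sse(docs, c))
    elite = pop[: max(2, len(pop) // 2)]
    children = [elite[0]]  # elitism: the best chromosome survives unchanged
    k = len(elite[0])
    while len(children) < len(pop):
        p1, p2 = rng.sample(elite, 2)
        cut = rng.randrange(1, k) if k > 1 else 0
        child = [list(c) for c in p1[:cut] + p2[cut:]]
        if rng.random() < mutation_rate:
            # Mutation: replace one center with a random document vector.
            child[rng.randrange(k)] = list(rng.choice(docs))
        children.append(kmeans_step(docs, child))
    return children

def hybrid_pga(docs, k, islands=3, pop_size=6, generations=20,
               migrate_every=5, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Each chromosome is k centers drawn at random from the document vectors.
    pops = [[[list(d) for d in rng.sample(docs, k)] for _ in range(pop_size)]
            for _ in range(islands)]
    for g in range(1, generations + 1):
        pops = [evolve_island(docs, pop, rng, mutation_rate) for pop in pops]
        if g % migrate_every == 0:
            # Migration: each island's worst member is replaced by a copy
            # of the neighbouring island's best member.
            bests = [min(pop, key=lambda c: sse(docs, c)) for pop in pops]
            for i, pop in enumerate(pops):
                worst = max(range(len(pop)), key=lambda j: sse(docs, pop[j]))
                pop[worst] = [list(c) for c in bests[i - 1]]
    best = min((c for pop in pops for c in pop), key=lambda c: sse(docs, c))
    return best, assign(docs, best)
```

Calling `hybrid_pga(to_vectors(texts), k=2)` on a small list of strings returns the best set of centers found and a cluster label for each text. Running each community in a separate process, and swapping in TF-IDF weights or cosine distance, would be natural variations on this sketch.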
Parallel Genetic Algorithm, K-means Clustering, Text Clustering, Vector Space Model, Feature Extraction
Wenhua Dai, Cuizhen Jiao, Tingting He
Department of Computer Science, Central China Normal University, Wuhan 430079, China; Department of Computer, Xianning College, Xianning 437005, China
International conference
Wuhan
English
2007-09-21 (date first posted online on the Wanfang platform; not the paper's publication date)