Conference topic

A Valid Clustering Algorithm for High-dimensional Large Data sets Based on Distributed Method

The data set is randomly divided into several subsets, and a genetic-algorithm-based fuzzy clustering method for high-dimensional data is proposed to cluster each subset. A fuzzy dissimilarity matrix is introduced to express the degree of dissimilarity between any two samples, and the high-dimensional samples are initialized as points on a two-dimensional plane. The genetic algorithm then iteratively optimizes the two-dimensional coordinates so that the Euclidean distance between points on the plane gradually approximates the fuzzy dissimilarity between the corresponding samples. Finally, the two-dimensional data are clustered with the FCM (fuzzy c-means) algorithm, which avoids the dependence of clustering validity on the spatial distribution of the high-dimensional samples. Experimental results show that the method produces high-quality clusterings and greatly improves clustering speed.
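The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the fuzzy dissimilarity degree, so normalized Euclidean distance stands in for it, and the GA operators (truncation selection, uniform crossover, Gaussian mutation), all parameter values, and the function names `random_split`, `dissimilarity`, `stress`, `ga_embed`, and `fcm` are hypothetical choices. How the per-subset results are merged is not stated in the abstract and is left out.

```python
import math
import random


def random_split(n, k, rng):
    """Randomly partition indices 0..n-1 into k near-equal subsets."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def dissimilarity(samples):
    """ASSUMPTION: Euclidean distance scaled into [0, 1] stands in for the
    fuzzy dissimilarity degree, which the abstract does not define."""
    n = len(samples)
    d = [[math.dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    top = max(max(row) for row in d) or 1.0
    return [[v / top for v in row] for row in d]


def stress(flat, diss):
    """Sum of squared gaps between 2-D distances and target dissimilarities."""
    n = len(diss)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dij = math.hypot(flat[2 * i] - flat[2 * j], flat[2 * i + 1] - flat[2 * j + 1])
            total += (dij - diss[i][j]) ** 2
    return total


def ga_embed(diss, rng, pop_size=30, generations=150, sigma=0.05):
    """Evolve flattened 2-D coordinates that minimise stress.
    Returns (coords, best stress per generation)."""
    n = len(diss)
    pop = [[rng.random() for _ in range(2 * n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: stress(c, diss))
        history.append(stress(pop[0], diss))
        parents = pop[: pop_size // 2]  # truncation selection; the best layouts survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover, then Gaussian mutation on about 10% of genes.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.1 else g for g in child]
            children.append(child)
        pop = parents + children
    best = min(pop, key=lambda c: stress(c, diss))
    return [(best[2 * i], best[2 * i + 1]) for i in range(n)], history


def fcm(points, c, rng, m=2.0, iters=50):
    """Standard fuzzy c-means on 2-D points; returns memberships u[i][k]."""
    centers = rng.sample(points, c)
    u = []
    for _ in range(iters):
        u = []
        for p in points:
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u.append([1.0 / sum((d[k] / d[l]) ** (2 / (m - 1)) for l in range(c))
                      for k in range(c)])
        centers = [
            tuple(sum(u[i][k] ** m * points[i][a] for i in range(len(points)))
                  / sum(u[i][k] ** m for i in range(len(points))) for a in range(2))
            for k in range(c)
        ]
    return u


rng = random.Random(0)
# Toy data: two Gaussian blobs in 10 dimensions (illustrative only).
data = [[rng.gauss(mu, 0.3) for _ in range(10)] for mu in (0.0, 3.0) for _ in range(10)]
for subset in random_split(len(data), 2, rng):
    coords, history = ga_embed(dissimilarity([data[i] for i in subset]), rng)
    memberships = fcm(coords, 2, rng)
    labels = [max(range(2), key=row.__getitem__) for row in memberships]
    print(subset, labels, round(history[-1], 3))
```

Because each subset is embedded and clustered independently, the loop over subsets could in principle run on separate workers, which is presumably where the speed-up reported in the abstract comes from. FCM only ever sees the two-dimensional coordinates, so its result depends on the dissimilarity matrix rather than on the geometry of the original space.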

fuzzy clustering; distributed method; genetic algorithm; fuzzy dissimilar matrix; large data sets; high dimension

GUO Xian'e; YAN Junmei

Mathematics and Computer Science Institution, Datong, Shanxi

International conference

2009 International Workshop on Information Security and Application (2009 信息安全与应用国际研讨会)

Qingdao

English

1-6

2009-11-21 (date first posted online on the Wanfang platform; not the paper's publication date)