Application of Gaussian Mixture Model Genetic Algorithm in Data Stream Clustering Analysis
A data stream is unbounded and arrives at high speed, so traditional clustering algorithms cannot be applied to it directly. As an efficient tool for data analysis, the Gaussian mixture model (GMM) has been widely applied in signal and information processing, and it can approximate clusters of arbitrary shape. Clustering analysis faces two critical problems: selecting an appropriate number of clusters and partitioning overlapping clusters. Based on an extension of Gaussian mixture modeling, this paper proposes a new feature mining method named Gaussian Mixture Model with Genetic Algorithms. The method performs probability-density-based data stream clustering that requires only the newly arrived data, not the entire historical data, and it can also estimate the optimal number of clusters. The algorithm determines the number of Gaussian clusters and the parameters of each Gaussian through the random split and merge operations of Genetic Algorithms. The resulting model gives accurate information about the characteristics that each attribute describes, which makes effective data stream mining possible.
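The split and merge idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes 1-D data, uses BIC as a stand-in fitness function, and replaces a full genetic population with a simple accept-if-better loop over random split and merge mutations. All function names (`em`, `bic`, `split`, `merge`, `fit_chunk`) are hypothetical, and the sketch fits a single chunk of newly arrived data without carrying state between chunks.

```python
import numpy as np

def log_lik(x, w, mu, var):
    # Per-point, per-component weighted densities and the total log-likelihood.
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return dens, np.log(dens.sum(axis=1) + 1e-300).sum()

def em(x, w, mu, var, iters=30):
    # A fixed number of EM iterations for a 1-D Gaussian mixture.
    for _ in range(iters):
        dens, _ = log_lik(x, w, mu, var)
        r = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)  # responsibilities
        nk = r.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6  # variance floor
    return w, mu, var

def bic(x, w, mu, var):
    # Assumed fitness: BIC with 3k - 1 free parameters (lower is better).
    _, ll = log_lik(x, w, mu, var)
    return -2.0 * ll + (3 * len(w) - 1) * np.log(len(x))

def split(rng, w, mu, var):
    # Replace one random component by two halves placed one std dev apart.
    j = rng.integers(len(w))
    s = np.sqrt(var[j])
    return (np.append(np.delete(w, j), [w[j] / 2, w[j] / 2]),
            np.append(np.delete(mu, j), [mu[j] - s, mu[j] + s]),
            np.append(np.delete(var, j), [var[j], var[j]]))

def merge(rng, w, mu, var):
    # Merge two random components by matching their combined mean and variance.
    i, j = rng.choice(len(w), size=2, replace=False)
    wm = w[i] + w[j]
    mm = (w[i] * mu[i] + w[j] * mu[j]) / wm
    vm = (w[i] * (var[i] + mu[i] ** 2) + w[j] * (var[j] + mu[j] ** 2)) / wm - mm ** 2
    return (np.append(np.delete(w, [i, j]), wm),
            np.append(np.delete(mu, [i, j]), mm),
            np.append(np.delete(var, [i, j]), vm))

def fit_chunk(x, generations=40, seed=0):
    # Start from one Gaussian; keep a mutated model only if its BIC improves.
    rng = np.random.default_rng(seed)
    best = em(x, np.array([1.0]), np.array([x.mean()]), np.array([x.var() + 1e-6]))
    best_score = bic(x, *best)
    for _ in range(generations):
        op = split if len(best[0]) == 1 or rng.random() < 0.5 else merge
        cand = em(x, *op(rng, *best))
        score = bic(x, *cand)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score

# Synthetic chunk with two well-separated clusters (illustrative data only).
demo_rng = np.random.default_rng(1)
chunk = np.concatenate([demo_rng.normal(-5, 1, 300), demo_rng.normal(5, 1, 300)])
(w, mu, var), score = fit_chunk(chunk)
print(len(w), np.sort(mu))
```

In a streaming setting, one plausible extension (again an assumption, not something the abstract specifies) would be to start each new chunk from the previous chunk's mixture instead of from a single Gaussian.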
Data stream clustering; Gaussian Mixture Model; Genetic Algorithms; clusters number
GAO Ming-ming, Chang Tai-hua, GAO Xiang-xiang
School of Control and Computer Engineering, North China Electric Power University, Beijing, China; China North Vehicle Research Institute, Beijing, China
International conference
Xiamen
English
786-790
2010-10-29 (date first made available online on the Wanfang platform; not the paper's publication date)