Gene Expression Analysis Using Clustering
Data mining has become an important tool for the effective analysis of gene expression data because of its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. Since our purpose is to demonstrate the effectiveness of the k-means algorithm on a wide variety of data sets, we have chosen two pattern-recognition data sets and thirteen microarray data sets with both overlapping and non-overlapping cluster boundaries, in which the number of features/genes ranges from 4 to 7129 and the number of samples ranges from 32 to 683. The number of clusters ranges from two to eleven. We use the clustering error rate (equivalently, clustering accuracy) as the evaluation metric to measure the performance of the k-means algorithm.
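As an illustration of the evaluation described in the abstract, the following is a minimal sketch of k-means together with a clustering error rate, not the authors' implementation. It assumes that the error rate is the fraction of samples misassigned under the best one-to-one mapping of clusters to known classes, and that the number of classes equals k. The function names, the evenly spaced initialisation (chosen only for reproducibility), and the toy data are hypothetical.

```python
import itertools


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    # Assumption: deterministic, evenly spaced initial centroids.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels


def clustering_error_rate(true_labels, cluster_labels):
    """Error under the best cluster-to-class mapping (brute force)."""
    classes = sorted(set(true_labels))
    best_correct = 0
    # Try every assignment of class names to cluster indices 0..k-1.
    for perm in itertools.permutations(classes):
        correct = sum(
            perm[c] == t for t, c in zip(true_labels, cluster_labels)
        )
        best_correct = max(best_correct, correct)
    return 1 - best_correct / len(true_labels)


# Toy data: two well-separated groups of three samples each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
truth = [0, 0, 0, 1, 1, 1]
labels = kmeans(points, k=2)
error = clustering_error_rate(truth, labels)
print(labels, error)
```

The brute-force search over mappings is practical only for small k. For values near the upper end of the range quoted above (eleven clusters), an assignment solver such as the Hungarian algorithm would be the usual substitute.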
Bio-informatics Cancer-Genomics Gene-expression Clustering Data-mining Microarray
Kumar Dhiraj Santanu Kumar Rath Abhishek Pandey
Dept. of Computer Science and Engineering, National Institute of Technology Rourkela, Rourkela, Orissa-769008, INDIA
International conference
Beijing
English
1-4
2009-06-11 (date first made available online on the Wanfang platform; does not represent the paper's publication date)