Gene Expression Analysis Using Clustering

Abstract:

Data mining has become an important tool for the effective analysis of gene expression data because of its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. To demonstrate the effectiveness of k-means on a wide variety of data, we chose two pattern recognition data sets and thirteen microarray data sets with both overlapping and non-overlapping cluster boundaries. The number of features (genes) ranges from 4 to 7129, the number of samples ranges from 32 to 683, and the number of clusters ranges from two to eleven. We use the clustering error rate (or, equivalently, the clustering accuracy) as the evaluation metric to measure the performance of the k-means algorithm.
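
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names kmeans and clustering_error_rate, the toy data, and the initialisation from randomly chosen data points are all assumptions made for illustration. The error rate is computed under the best one-to-one mapping from clusters to known classes, found here by brute force over permutations, which is only practical for a small number of clusters and assumes the number of clusters equals the number of classes.

```python
import itertools
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd-style k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialise from k data points
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def clustering_error_rate(true_labels, cluster_labels):
    """Fraction of samples misassigned under the best cluster-to-class mapping.

    Assumes cluster ids are 0..k-1 and that k equals the number of classes.
    """
    classes = sorted(set(true_labels))
    best_correct = 0
    for perm in itertools.permutations(classes):
        # perm[c] is the class that cluster c is mapped to in this trial.
        correct = sum(perm[c] == t for c, t in zip(cluster_labels, true_labels))
        best_correct = max(best_correct, correct)
    return 1 - best_correct / len(true_labels)


# Toy data (hypothetical): two well-separated groups of three samples each.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
truth = ["A", "A", "A", "B", "B", "B"]

labels = kmeans(points, k=2)
error = clustering_error_rate(truth, labels)
print("error rate:", error, "accuracy:", 1 - error)
```

For data sets with many clusters, the permutation search would normally be replaced by an assignment solver, since the number of permutations grows factorially with the number of clusters.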

Keywords: Bio-informatics; Cancer-Genomics; Gene expression; Clustering; Data-mining; Microarray

Authors: Kumar Dhiraj, Santanu Kumar Rath, Abhishek Pandey

Affiliation: Dept. of Computer Science and Engineering, National Institute of Technology Rourkela, Rourkela, Orissa-769008, India

Conference type: International conference

Conference name: The 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009)

Conference location: Beijing

Conference language: English

Pages: 1-4

Online publication date: 2009-06-11 (date first made available on the Wanfang platform; not the paper's publication date)
