Validity of Cluster Technique for Genome Expression Data
With the rapid development of database technology and the wide application of DBMSs, people hold more and more data containing a great deal of valuable information. They want to analyse these data more deeply in order to make better use of that information. Current database systems support data entry, search, statistics, and so on, but they cannot forecast future trends from the data stored in the database. The lack of means to mine the knowledge hidden behind the data leads to a situation in which data are plentiful but knowledge is poor. In the era of computer networks, how to obtain knowledge from large volumes of data effectively and rapidly has become a focus of attention. The ability to acquire data has increasingly outpaced the ability to analyse it, so an automatic technology that can process data at a deeper level is needed. Data mining is such a technology. As an important branch of data mining, clustering analysis, which can serve as an independent data-mining tool or as a preprocessing step for other data-mining algorithms, is attracting wide attention. Clustering is a form of unsupervised classification, and it is an important method by which people come to understand society and nature. As one of the most important components of data mining, clustering has been widely used in biological science. Several clustering algorithms have been suggested for analysing genome expression data, but fewer solutions have been implemented to guide the design of clustering-based experiments and assess the quality of their outcomes. A cluster validity framework provides insights into the problem of predicting the correct number of clusters. This paper presents several validation techniques for gene expression data analysis. Normalization and validity aggregation strategies are proposed to improve the prediction of the number of relevant clusters.
The results obtained indicate that this systematic evaluation approach may significantly support genome expression analyses for knowledge discovery applications.
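The idea of normalizing and aggregating validity indices to estimate the number of clusters can be illustrated with a minimal sketch. The abstract does not say which indices or which aggregation rule the paper uses, so everything here is an assumption: the index names (silhouette, Dunn, Davies-Bouldin), the min-max normalization, the unweighted mean, the helper names `min_max_normalize` and `aggregate_validity`, and the toy scores, which are invented for illustration and are not results from the paper.

```python
def min_max_normalize(scores, higher_is_better=True):
    """Rescale {k: score} to [0, 1] so that 1 always means 'best'."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A constant index carries no preference between candidate k values.
        return {k: 0.5 for k in scores}
    normalized = {k: (v - lo) / (hi - lo) for k, v in scores.items()}
    if not higher_is_better:
        # Flip indices that are minimized (assumed here for Davies-Bouldin).
        normalized = {k: 1.0 - v for k, v in normalized.items()}
    return normalized


def aggregate_validity(index_scores, higher_is_better):
    """Average the normalized indices per k and return (best_k, aggregated)."""
    ks = sorted(next(iter(index_scores.values())))
    normalized = {
        name: min_max_normalize(scores, higher_is_better[name])
        for name, scores in index_scores.items()
    }
    aggregated = {
        k: sum(n[k] for n in normalized.values()) / len(normalized) for k in ks
    }
    best_k = max(aggregated, key=aggregated.get)
    return best_k, aggregated


# Hypothetical index values for k = 2..5 (illustrative only, not from the paper).
toy_scores = {
    "silhouette": {2: 0.41, 3: 0.62, 4: 0.48, 5: 0.39},
    "dunn": {2: 0.30, 3: 0.55, 4: 0.28, 5: 0.20},
    "davies_bouldin": {2: 0.95, 3: 0.60, 4: 0.88, 5: 1.10},
}
direction = {"silhouette": True, "dunn": True, "davies_bouldin": False}

best_k, aggregated = aggregate_validity(toy_scores, direction)
print(best_k, aggregated)
```

Putting every index on a common [0, 1] scale before averaging keeps an index with a large numeric range from dominating the vote; a weighted mean or a rank-based vote would be equally plausible choices.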
Genome expression; Clustering; Cluster validation; Genomic data mining
Xiao Zhang, Aichen Li, You Zhang, Yongpeng Xiao
School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, China.
International conference
The 24th Chinese Control and Decision Conference (2012 CCDC)
Taiyuan
English
3754-3758
2012-05-23 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)