Conference Topic

A Residue-Based Cluster Validity Index for Gene Expression Data Biclustering

Biclustering simultaneously partitions the set of genes and the set of their conditions into biclusters using gene expression data. In theory, the automated variable weighting K-means clustering algorithm (W-K-means) is suitable for biclustering. However, the number of biclusters, K, must be specified for the W-K-means algorithm, and the quality of the biclustering result depends heavily on this parameter. In this paper, we propose a novel residue-based cluster validity index to determine the value of K. A residue indicates how coherent its corresponding expression level is with the remaining expression levels within a bicluster. Coherent tendencies are easier to evaluate using residues than using expression levels, so analyzing the Mean Squared Residue (MSR) model, which takes the residue into account, is helpful for biclustering. The main idea of the proposed index is to translate the result of the W-K-means algorithm, including the gene-bicluster membership matrix and the condition-bicluster membership matrix, into a form that matches the MSR model. The appropriate number of biclusters generated by the W-K-means algorithm can therefore be determined on the basis of the MSR model, which makes the determination meaningful and reasonable.
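To make the residue and MSR mentioned in the abstract concrete, the following is a minimal sketch of the mean squared residue as it is commonly defined for biclustering (the Cheng and Church formulation). It assumes the paper uses this standard definition, which the abstract does not spell out. The function name `mean_squared_residue` and the toy matrices are hypothetical, and the sketch does not reproduce the proposed validity index or the W-K-means algorithm.

```python
def mean_squared_residue(rows):
    """Return (msr, residues) for one bicluster.

    `rows` is a list of equal-length lists: genes as rows, conditions as
    columns. Assumed standard definition of the residue of entry (i, j):
        r_ij = a_ij - row_mean_i - col_mean_j + overall_mean
    The MSR is the mean of r_ij squared over all entries.
    """
    n_genes = len(rows)
    n_conds = len(rows[0])
    row_means = [sum(r) / n_conds for r in rows]
    col_means = [
        sum(rows[i][j] for i in range(n_genes)) / n_genes
        for j in range(n_conds)
    ]
    overall_mean = sum(row_means) / n_genes
    residues = [
        [
            rows[i][j] - row_means[i] - col_means[j] + overall_mean
            for j in range(n_conds)
        ]
        for i in range(n_genes)
    ]
    msr = sum(r * r for row in residues for r in row) / (n_genes * n_conds)
    return msr, residues


# Toy data (made up for illustration): every row is a shifted copy of the
# first row, i.e. a perfectly coherent additive pattern.
coherent = [[1.0, 2.0, 4.0], [3.0, 4.0, 6.0], [0.0, 1.0, 3.0]]
# Same matrix with one expression level perturbed.
noisy = [[1.0, 2.0, 4.0], [3.0, 9.0, 6.0], [0.0, 1.0, 3.0]]

msr_coherent, _ = mean_squared_residue(coherent)
msr_noisy, _ = mean_squared_residue(noisy)
print(msr_coherent, msr_noisy)
```

Under this definition, a perfectly coherent bicluster has an MSR of zero up to floating-point error, and perturbing a single expression level raises it, which is why a lower MSR is read as stronger coherence.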

biclustering; cluster validity index; residue; mean squared residue model; W-K-means algorithm

Chieh-Yuan Tsai, Chuang-Cheng Chiu

Department of Industrial Engineering and Management, Yuan Ze University, Chung-Li, Taiwan

International conference

The 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009)

Beijing

English

1-4

2009-06-11 (date first made available online on the Wanfang platform; not the paper's publication date)