Incorporating Protein-protein Interactions Knowledge in Clustering Gene Expression Data
In this paper, a similarity measure between genes derived from protein-protein interactions is proposed, and a combined dissimilarity measure is defined on the basis of it. Introducing the combined dissimilarity measure into the K-means method yields an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset, with performance assessed by a prediction accuracy analysis against known gene annotations. Our results show that the improved K-means method outperforms the other clustering methods. The performance of the improved K-means method is also tested by varying the tuning parameter of the combined dissimilarity measure; the results show that performance increases as the tuning parameter decreases. Finally, a framework for integrating various kinds of biological prior knowledge with gene expression data is proposed.
protein-protein interactions; gene expression data; clustering; data fusion
Gangguo Li, Zhengzhi Wang
Institute of Automation, National University of Defense Technology, Changsha 410073, Hunan, China
International conference
Shanghai
English
207-210
2008-05-16 (date first posted online on the Wanfang platform; not the paper's publication date)