Validating clustering for gene expression data based on semantic distance of gene ontology terms
Clustering algorithms for gene expression data attempt to partition the data into groups that exhibit similar patterns of variation in expression level. Many clustering algorithms have been proposed, but little guidance is available for evaluating the biological meaning of clustering results. We developed a new algorithm to measure the semantic distance between Gene Ontology (GO) terms. Based on this algorithm, we propose a novel method to assess the biological predictive power of clustering algorithms: within a cluster, the more similar the functions of the genes, the lower the semantic distance. We applied the approach to evaluate hierarchical clustering algorithms on yeast cell and diabetes datasets and successfully obtained the biological features of the gene clusters. We found that the approach may contribute to achieving better clustering results.
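The validation idea in the abstract can be illustrated with a minimal sketch. The semantic distance algorithm itself is not described in this record, so the code below swaps in a simple stand-in: the shortest path between two terms through a common ancestor. The toy ontology, the gene annotations, and the function names (`term_distance`, `gene_distance`, `cluster_score`) are all hypothetical, and taking the minimum over term pairs as the gene-level distance is likewise an assumption.

```python
from itertools import combinations

# Hypothetical toy ontology: each term maps to its parent terms (single root).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "glycolysis": ["metabolism"],
    "lipid_synthesis": ["metabolism"],
    "insulin_signaling": ["signaling"],
}

# Hypothetical annotations: each gene maps to the terms assigned to it.
ANNOTATIONS = {
    "geneA": ["glycolysis"],
    "geneB": ["lipid_synthesis"],
    "geneC": ["insulin_signaling"],
    "geneD": ["glycolysis", "metabolism"],
}


def ancestor_depths(term):
    """Map each ancestor of `term` (itself included) to the fewest edges up to it."""
    depths = {term: 0}
    frontier = [term]
    while frontier:  # breadth-first walk towards the root
        next_frontier = []
        for current in frontier:
            for parent in PARENTS[current]:
                if parent not in depths:
                    depths[parent] = depths[current] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return depths


def term_distance(a, b):
    """Stand-in distance: shortest path from a to b via a common ancestor."""
    depths_a, depths_b = ancestor_depths(a), ancestor_depths(b)
    common = depths_a.keys() & depths_b.keys()
    return min(depths_a[c] + depths_b[c] for c in common)


def gene_distance(g1, g2):
    """Assumed gene-level distance: the closest pair of annotated terms."""
    return min(term_distance(a, b)
               for a in ANNOTATIONS[g1] for b in ANNOTATIONS[g2])


def cluster_score(genes):
    """Mean pairwise gene distance within a cluster; lower means more coherent."""
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 0.0
    return sum(gene_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(cluster_score(["geneA", "geneB", "geneD"]))  # all metabolic genes
    print(cluster_score(["geneA", "geneB", "geneC"]))  # mixed functions
```

Under these assumptions, a cluster whose genes share related annotations receives a lower score than a functionally mixed one, which is the ordering the abstract describes. Comparing such scores across the clusterings produced by different algorithms is one way the measure could guide algorithm choice.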
gene cluster validate GO semantic distance
Feizhen Wu Wenli Ma Mei Wang Qilong Chen Wenling Zheng
Bioelectronic Center, Shanghai University, Shanghai, Peoples R China; Bioelectronic Center, Shanghai University, Shanghai, Peoples R China; Inst Gent Engn, Southern Medica Fudan University, Huashan Hosp, Dept Endocrinology, Shanghai, Peoples R China
International conference
Shanghai
English
706-709
2008-05-16 (date first posted online on the Wanfang platform; not the paper's publication date)