Conference Topic

Integrating Multiple Gene Semantic Similarity Profiles to Infer Disease Genes

The inference of genes associated with human inherited diseases (disease genes) has long been a highly challenging task in biological and medical studies. Many computational methods have been proposed to prioritize candidate genes using a variety of genomic information. In this work, we propose a novel perspective that treats the inference of disease genes as a binary classification problem. We integrate three semantic similarity profiles of human genes, a phenotype similarity profile of human diseases, and known associations between diseases and genes to obtain three numerical features that indicate the relevance of a given disease-gene pair. With these features, we use three classification methods (logistic regression, random forest, and support vector machine) to predict whether a gene is truly associated with a disease. We use 10-fold cross-validation experiments to assess the performance of the proposed method and show the effectiveness of this approach. We further show that this binary classification formulation can also be used to prioritize candidate genes.
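The binary classification setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for the three relevance features, only logistic regression is shown (hand-written with plain gradient descent), and every function name and parameter below is an assumption made for illustration.

```python
import math
import random

def make_synthetic_pairs(n, seed=0):
    """Synthetic stand-in data (an assumption, not the paper's data):
    three features per disease-gene pair, label 1 = associated."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        # Associated pairs get higher feature values; purely illustrative.
        center = 0.7 if label else 0.3
        features = [rng.gauss(center, 0.15) for _ in range(3)]
        data.append((features, label))
    rng.shuffle(data)
    return data

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(train, epochs=300, lr=1.0):
    """Fit logistic regression by batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    n = len(train)
    for _ in range(epochs):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = 0.0
        for x, y in train:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(3):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    """Return 1 if the pair is predicted to be associated, else 0."""
    w, b = model
    score = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1 if score >= 0.5 else 0

def cross_validate(data, k=10):
    """k-fold cross-validation; returns the accuracy of each fold."""
    fold_accuracies = []
    for fold in range(k):
        test = data[fold::k]
        train = [d for i, d in enumerate(data) if i % k != fold]
        model = train_logistic(train)
        correct = sum(predict(model, x) == y for x, y in test)
        fold_accuracies.append(correct / len(test))
    return fold_accuracies

accuracies = cross_validate(make_synthetic_pairs(200))
mean_accuracy = sum(accuracies) / len(accuracies)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

The predicted probability from such a classifier could also serve as a ranking score, which is one plausible reading of how a binary classifier might be reused to prioritize candidate genes.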

Disease genes; prediction; prioritization; gene semantic similarity; phenotype similarity

HE Peng, JIANG Rui

MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsin

International conference

The 31st Chinese Control Conference (第三十一届中国控制会议)

Hefei

English

7420-7425

2012-07-01 (date first posted online on the Wanfang platform; not the publication date of the paper)