Random Walk Based Global Feature for Disease Gene Identification
Disease gene identification is of great significance for the treatment of genetic disorders. In recent years, the rapid development of high-throughput sequencing technologies has revolutionized disease gene identification methods. Network-based methods are now the most effective approach to disease gene identification, but most current methods consider only local topological attributes and ignore the global distribution. In this paper, we propose applying the random walk algorithm to extract global features for each gene and then using a binary logistic regression model to decide whether a gene is associated with a given disease. We also integrate the local and global features into a composite feature vector to improve identification performance. The experimental results show that the global feature is highly effective for disease gene identification. We organize the global feature into several kinds of feature vectors, and each of them yields a higher AUC score than other state-of-the-art methods.
Gene identification; Logistic regression; Global features; Disease gene
Lezhen Wei, Shuai Wu, Jian Zhang, Yong Xu
Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China
International conference
The 7th Chinese Conference on Pattern Recognition (CCPR 2016)
Chengdu
English
464-473
2016-11-03 (date first posted online on the Wanfang platform; not the paper's publication date)
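
The pipeline outlined in the abstract (random-walk scores as a global feature, followed by binary logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the paper's feature definition, so the sketch assumes a random walk with restart, a single score per gene as the global feature, a hypothetical six-gene toy network (genes A to F), and a plain gradient-descent logistic fit. All function names and parameter values are illustrative.

```python
import math

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adj maps each node to its neighbour list (undirected, no isolated nodes assumed).
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node spreads its non-restart mass evenly over its neighbours.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Predicted probability that a gene with feature vector x is a disease gene."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Binary logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Hypothetical toy network: two triangles (A-B-C and D-E-F) joined by the edge C-D.
network = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}

# Global feature (assumed form): proximity of every gene to the known disease gene A.
scores = random_walk_with_restart(network, seeds={"A"})

genes = sorted(network)
X = [[scores[g]] for g in genes]
y = [1 if g in {"A", "B", "C"} else 0 for g in genes]  # assumed labels
w, b = fit_logistic(X, y)

for g in genes:
    print(g, round(scores[g], 4), round(predict(w, b, [scores[g]]), 3))
```

Because the walk propagates probability across the whole network, each score reflects the gene's position relative to all seeds rather than only its immediate neighbours. In a real evaluation the seed genes would have to be held out from the genes being scored, since a seed trivially receives a high score for itself.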