Conference topic

Identification of DNA-binding residues of a protein from its primary sequence

Identifying DNA-binding residues in proteins is important in several areas, such as posttranscriptional regulation and the study of protein function. In this work, we propose a method that combines a novel hybrid feature with the random forest (RF) algorithm to predict DNA-binding residues in protein sequences. The hybrid feature comprises a secondary structure feature, predicted solvent accessibility, and a novel feature that combines evolutionary information with physicochemical properties. Furthermore, a performance comparison of the individual features indicates that the novel feature contributes most to the improvement in prediction. The results demonstrate that our model achieves a Matthews correlation coefficient (MCC) of 0.7238 and an overall accuracy (ACC) of 92.67%, with a sensitivity (SE) of 78.96% and a specificity (SP) of 94.56%. The prediction model clearly offers significantly better performance in predicting DNA-binding sites in proteins.
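The four measures quoted in the abstract (SE, SP, ACC, MCC) are standard functions of the confusion-matrix counts for a binary binding / non-binding classifier. The following is a minimal sketch of how they are commonly computed. The function name and the example counts are hypothetical and are not taken from the paper, which does not report its raw counts here.

```python
import math


def binding_site_metrics(tp, tn, fp, fn):
    """Compute SE, SP, ACC and MCC from confusion-matrix counts.

    tp/fn: binding residues predicted correctly / missed.
    tn/fp: non-binding residues predicted correctly / wrongly flagged.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    se = ratio(tp, tp + fn)                    # sensitivity (recall on binding residues)
    sp = ratio(tn, tn + fp)                    # specificity (recall on non-binding residues)
    acc = ratio(tp + tn, tp + tn + fp + fn)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, denom)      # Matthews correlation coefficient
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts for illustration only (not the paper's data).
metrics = binding_site_metrics(tp=80, tn=900, fp=50, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because binding residues are usually far rarer than non-binding ones, ACC alone can look high even for a weak predictor, which is why MCC is typically reported alongside it.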

Random forest; RNA-binding residues; Position specific scoring matrix

Xin Ma; Lefu Hu

Golden Audit College, Nanjing Audit University, Nanjing 210029, P. R. China; Physical Education Department, Nanjing Audit University, Nanjing 210029, P. R. China

International conference

2012 Fifth International Symposium on Computational Intelligence and Design (ISCID 2012)

Hangzhou

English

290-293

2012-10-28 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)