Distinguishing enzymes from non-enzymes via support vector machine
As more proteins are sequenced, the ability to predict protein function from sequence is becoming increasingly important. Current methods for inferring functional annotation mostly rely on identifying a protein of known function that is similar to the query protein. However, these methods lose effectiveness when the query protein has no similar proteins, or is similar only to proteins of unknown function. In this paper, we propose a new method for distinguishing enzymes from non-enzymes without similarity search. We use the conjoint triad feature, secondary-structure content and surface pocket properties to describe 1178 high-resolution proteins, and apply a support vector machine to assign each described protein to a class. With 10-fold cross-validation, the accuracy of predicting the functional class (enzyme or non-enzyme) is about 85.19%. Moreover, by selecting the informative features, the accuracy can be improved to 86.31%. These results suggest that this new sequence-based method can also be used to predict other functional class memberships of proteins.
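The abstract does not spell out how the conjoint triad feature is computed. As a minimal sketch, assuming the commonly used encoding in which the 20 amino acids are grouped into seven classes and every run of three consecutive residues is counted as one of 7 × 7 × 7 = 343 triad types, the descriptor could look like the following. The class grouping, the normalization and the function name `conjoint_triad` are assumptions for illustration, not details taken from the paper.

```python
# Seven amino-acid classes (assumed grouping; the abstract does not list it).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional triad frequency vector for a protein sequence.

    Hypothetical helper: triads containing a non-standard residue are skipped,
    and counts are rescaled as (count - min) / max (an assumed normalization).
    """
    counts = [0] * 343
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three consecutive residues along the sequence.
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        if a is None or b is None or c is None:
            continue  # skip triads with unknown residues such as 'X'
        counts[a * 49 + b * 7 + c] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:
        return [0.0] * 343  # sequence too short to contain any valid triad
    return [(n - lo) / hi for n in counts]


if __name__ == "__main__":
    vec = conjoint_triad("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vec), sum(1 for v in vec if v > 0))
```

In a pipeline like the one the abstract describes, a vector of this kind would be concatenated with the secondary-structure and surface pocket descriptors before being passed to the support vector classifier.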
Protein function; Functional class of enzymes and non-enzymes; Support vector classification; Feature vectors
Yongcui Wang, Yingjie Tian, Naiyang Deng
College of Science, China Agricultural University, Beijing, China, 100083; Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, China, 10
International conference
The Second International Symposium on Optimization and Systems Biology (OSB08)
Lijiang, Yunnan
English
166-173
2008-10-31 (date first posted online on the Wanfang platform; not the publication date of the paper)