A Novel Ensemble Approach to Prediction of Protein Subcellular Location
The prediction of protein subcellular location has attracted considerable attention in both research and practical application, because many earlier studies have demonstrated the close relationship between a protein's function and its location, and because the Human Genome Project has been successfully completed in recent decades. With the rapid growth in computing speed, computational intelligence methods have come to dominate the prediction of protein subcellular location. In this study, we use the pseudo amino acid (PseAA) composition model to extract features from the primary sequence of a protein as classifier input. Using the evolutionary fuzzy k-nearest neighbor (EFKNN) algorithm, we train six base classifiers, each with a different A-value, a parameter that plays an important role in training and classification. We then propose a novel ensemble approach, named accumulative vote quantity (AVQ), to integrate the outputs of the six base classifiers. To verify the effectiveness of the proposed method, we use as the training set the benchmark dataset constructed by Jennifer L. Gardy and Fiona S.L. Brinkman in 2006, which covers five subcellular locations of Gram-negative bacteria. The jackknife test on this dataset yields a success rate of 80.0%, which indicates that the proposed method can serve as a powerful prediction tool or, to some extent, as a complement to existing prediction methods.
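The pipeline outlined in the abstract (several fuzzy k-NN base classifiers, a vote-accumulating ensemble, and jackknife evaluation) can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' method: the abstract defines neither the A-value nor the AVQ rule, so the sketch varies the neighbor count k across the six base classifiers and simply sums their class memberships. The evolutionary parameter search is omitted, toy 2-D vectors stand in for PseAA features, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def fuzzy_knn_memberships(train, query, k, m=2.0):
    """Class memberships of `query` from its k nearest neighbors.

    train: list of (feature_vector, label) pairs with crisp labels.
    m: fuzzifier (> 1); m = 2 gives inverse squared-distance weights.
    """
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    weights = defaultdict(float)
    total = 0.0
    for d, label in nearest:
        # Small constant avoids division by zero for identical vectors.
        w = 1.0 / (d ** (2.0 / (m - 1.0)) + 1e-12)
        weights[label] += w
        total += w
    # Normalize so the memberships of one classifier sum to 1.
    return {label: w / total for label, w in weights.items()}


def accumulate_votes(train, query, ks):
    """Assumed ensemble rule: sum memberships over all base classifiers."""
    totals = defaultdict(float)
    for k in ks:  # one base classifier per k (an assumption, see lead-in)
        for label, mu in fuzzy_knn_memberships(train, query, k).items():
            totals[label] += mu
    return max(totals, key=totals.get), dict(totals)


def jackknife_accuracy(data, ks):
    """Leave-one-out test: each sample is predicted from all the others."""
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        predicted, _ = accumulate_votes(rest, x, ks)
        correct += predicted == y
    return correct / len(data)


if __name__ == "__main__":
    # Toy stand-in for PseAA feature vectors of two locations.
    toy = [
        ((0.0, 0.0), "cytoplasm"), ((0.1, 0.0), "cytoplasm"),
        ((0.0, 0.1), "cytoplasm"), ((0.1, 0.1), "cytoplasm"),
        ((1.0, 1.0), "periplasm"), ((0.9, 1.0), "periplasm"),
        ((1.0, 0.9), "periplasm"), ((0.9, 0.9), "periplasm"),
    ]
    print(jackknife_accuracy(toy, ks=(1, 2, 3, 4, 5, 6)))
```

Summing memberships instead of counting hard votes lets a base classifier that is unsure contribute less to the final decision; whether AVQ works this way cannot be determined from the abstract alone.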
protein subcellular location; evolutionary fuzzy KNN; pseudo amino acid composition; ensemble learning; accumulative vote quantity; jackknife test
Chen Yue-hui, Liu Li-yuan, Ma Bing-xian
School of Information Science and Engineering, University of Jinan, Jinan, China
International conference
Taiyuan
English
544-547
2010-10-22 (date first posted online on the Wanfang platform; this is not the paper's publication date)