Conference Topic

Protein Function Prediction Using Kernel Logistic Regression with ROC Curves

To avoid over-fitting in protein function prediction based on protein-protein interactions (PPI), we propose a pattern recognition strategy in which all the features of the PPI observation data are divided into three sets: a training set, a learning set and a testing set. The classifiers are trained on the training set, the receiver operating characteristic (ROC) curve and the optimal operating point (OOP) are calculated on the learning set, and the accuracy rate is reported on the testing set at the OOP. Under this framework, we compare the performance of the logistic regression (LR) model with that of the kernel logistic regression (KLR) model on two feature sets derived from the PPI data: the 1-order feature set and the 2-order feature set. Experimental results on a standard PPI data set show that the KLR model performs better than the LR model on the training sets of both the 1-order and 2-order feature sets, and that the 2-order feature set outperforms the 1-order feature set with the KLR model on the training set. The predictive rates on the testing set for both the 1-order and 2-order feature sets, with either LR or KLR, can reach 95%.
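The three-set protocol described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the data are synthetic two-dimensional Gaussians rather than PPI features, only plain LR is fitted (KLR is not implemented here), and the OOP is chosen by maximising Youden's J (TPR minus FPR), since the abstract does not state which criterion the authors use. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_lr(X, y, step=0.1, epochs=200):
    """Fit plain logistic regression by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - step * g / n for wj, g in zip(w, gw)]
        b -= step * gb / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def roc_points(scores, labels):
    """Return (threshold, FPR, TPR) for every distinct score used as a threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def optimal_operating_point(points):
    # Assumption: the OOP is the ROC point maximising Youden's J = TPR - FPR.
    return max(points, key=lambda p: p[2] - p[1])


def accuracy(scores, labels, threshold):
    hits = sum(1 for s, l in zip(scores, labels) if (s >= threshold) == (l == 1))
    return hits / len(labels)


def split_xy(rows):
    return [x for x, _ in rows], [y for _, y in rows]


# Synthetic stand-in data (not PPI features): two Gaussian classes.
rng = random.Random(0)


def make_sample():
    label = rng.randint(0, 1)
    centre = 1.0 if label == 1 else -1.0
    return [rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label


data = [make_sample() for _ in range(600)]
X_train, y_train = split_xy(data[:300])      # training set: fit the classifier
X_learn, y_learn = split_xy(data[300:450])   # learning set: ROC curve and OOP
X_test, y_test = split_xy(data[450:])        # testing set: report accuracy

model = train_lr(X_train, y_train)
threshold, fpr, tpr = optimal_operating_point(
    roc_points(predict_proba(model, X_learn), y_learn)
)
test_accuracy = accuracy(predict_proba(model, X_test), y_test, threshold)
print(f"OOP threshold={threshold:.3f} FPR={fpr:.3f} TPR={tpr:.3f}")
print(f"test accuracy at OOP={test_accuracy:.3f}")
```

The point of the design is that the decision threshold is selected on data the classifier was not trained on, and the reported accuracy comes from data that influenced neither the fit nor the threshold.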

protein-protein interaction; logistic regression; kernel logistic regression; receiver operating characteristic; optimal operating point

Jingwei Liu, Minping Qian

School of Mathematics and System Sciences, Beihang University, LMIB of the Ministry of Education, Beiji; LMAM, School of Mathematical Sciences & Center for Theoretical Biology, Peking University, Beijing, 10

International conference

2010 International Conference on Bio-inspired System and Signal Processing (2010 IEEE International Conference on Biological Systems and Signal Processing, ICBSSP 2010)

Xiamen

English

139-143

2010-10-26 (date first posted online on the Wanfang platform; not the paper's publication date)